In this paper we propose a general methodology, based on multiple
testing, for testing that the mean of a Gaussian vector in $\mathbb{R}^n$
belongs to a convex set. We show that the test achieves its nominal 
level, and characterize a class of vectors over which the tests achieve 
a prescribed power. In the functional regression model this general 
methodology is applied to test some qualitative hypotheses on the re- 
gression function. For example, we test that the regression function is 
positive, increasing, convex, or more generally, satisfies a differential 
inequality. Uniform separation rates over classes of smooth functions 
are established and a comparison with other results in the literature 
is provided. A simulation study evaluates some of the procedures for 
testing monotonicity. 

1. Introduction. 

1.1. The statistical framework. We consider the following regression model: 

(1) $Y_i = F(x_i) + \sigma\varepsilon_i, \quad i = 1,\dots,n,$

where $x_1 < x_2 < \cdots < x_n$ are known deterministic points in [0, 1], $\sigma$ is an
unknown positive number and $(\varepsilon_i)_{i=1,\dots,n}$ is a sequence of i.i.d. unobservable
standard Gaussian random variables. From the observation of
$\mathbf{Y} = (Y_1,\dots,Y_n)'$, we consider the problem of testing that the regression function
$F$ belongs to one of the following functional sets $\mathcal{K}$:

(2) $\mathcal{K}_{\ge 0} = \{F : [0,1] \to \mathbb{R},\ F \text{ is nonnegative}\},$

(3) $\mathcal{K}_{\nearrow} = \{F : [0,1] \to \mathbb{R},\ F \text{ is nondecreasing}\},$

(4) $\mathcal{K}_{\smile} = \{F : [0,1] \to \mathbb{R},\ F \text{ is nonconcave}\},$

(5) $\mathcal{K}_{r,R} = \left\{F : [0,1] \to \mathbb{R},\ \forall x \in [0,1],\ \frac{d^r}{dx^r}[R(x)F(x)] \ge 0\right\}.$

In the above definition of $\mathcal{K}_{r,R}$, $r$ denotes a positive integer and $R$ a smooth,
nonvanishing function from [0, 1] into $\mathbb{R}$. Choosing the function $R$ equal
to 1 leads to test that the derivative of order r is positive. Taking r = 1 
and choosing a suitable function R leads to test that a positive function 
F is decreasing at some prescribed rate. It is also possible to test that F 
belongs to some classes of smooth functions. These testing hypotheses will 
be detailed in Section 3. 

The problem is therefore to test some qualitative hypothesis on F. We 
shall show that it actually reduces to testing that the mean of the Gaussian 
vector $\mathbf{Y}$ belongs to a suitable convex subset of $\mathbb{R}^n$. Denoting by $\langle\cdot,\cdot\rangle$ the
inner product of $\mathbb{R}^n$, this convex subset takes the form

$\mathcal{C} = \{\mathbf{f} \in \mathbb{R}^n,\ \forall j \in \{1,\dots,p\}\ \langle \mathbf{f}, \mathbf{v}_j\rangle \le 0\},$

where the vectors $\{\mathbf{v}_1,\dots,\mathbf{v}_p\}$ are linearly independent in $\mathbb{R}^n$. The aim of
this paper is to present a general methodology for the problem of testing
that $\mathbf{f}$ belongs to $\mathcal{C}$ and to characterize a class of vectors over which the tests
achieve a prescribed power. This general methodology is applied to test that
the regression function $F$ belongs to one of the sets $\mathcal{K}$. For the procedures
we propose, the least-favorable distribution under the null hypothesis is
achieved for $F = 0$ and $\sigma = 1$. Consequently, by carrying out simulations,
we easily obtain tests that achieve their nominal level for fixed values of 
n. Moreover, we show that these tests have good properties under smooth 
alternatives. 

For the problem of testing positivity, monotonicity and convexity, we ob- 
tain tests based on the comparison of local means of consecutive observa- 
tions. A precise description of these tests is given in Section 2. For the prob- 
lem of testing monotonicity, our methodology also leads to tests based on 
the slopes of regression lines on short intervals, as explained in Section 3.1. 
These procedures, based on "running gradients," are akin to those proposed 
by Hall and Heckman (2000). For the problem of testing that F belongs 
to $\mathcal{K}_{r,R}$ with a nonconstant function $R$ we refer the reader to Section 3.2.
We have delayed the description of the general methodology for testing that 
f belongs to C to Section 4. Simulation studies for testing monotonicity 
are shown in Section 5. The proofs are postponed to Sections 6-9 and the 
Appendix. 

1.2. An overview of the literature. In the literature tests of monotonic- 
ity have been widely studied in the regression model. The test proposed 
by Bowman, Jones and Gijbels (1998) is based on a procedure described in
Silverman (1981) for testing unimodality of a density. This test is not powerful
when the regression is flat or nearly flat, as emphasized by Hall and
Heckman (2000). Hall and Heckman (2000) proposed a procedure based on
"running gradients" over short intervals for which the least-favorable
distribution under the null, when $\sigma$ is known, corresponds to the case where
$F$ is identically constant. The test proposed by Gijbels, Hall, Jones and
Koch (2000) is based on the signs of differences between observations. The
test has the advantage of not depending on the error distribution when it
is continuous. Consequently, the nominal level of the test is guaranteed for
all continuous error distributions. In the functional regression model with
random $x_i$'s, the procedure proposed by Ghosal, Sen and van der Vaart
(2000) is based on a locally weighted version of Kendall's tau. The procedure
uses kernel smoothing with a particular choice of the bandwidth, and
as in Gijbels, Hall, Jones and Koch (2000) depends on the signs of the
quantities $(Y_j - Y_i)(x_i - x_j)$. They show that for certain local alternatives the
power of their test tends to 1. Some comments on the power of our test
under those alternatives can be found in Section 3.3. In Baraud, Huet and
Laurent (2003b) we propose a procedure which aims at detecting discrepancies
with respect to the $L^2(\mu_n)$-distance where $\mu_n = n^{-1}\sum_{i=1}^n \delta_{x_i}$. This
procedure generalizes that proposed in Baraud, Huet and Laurent (2003a)
for linear hypotheses. A common feature of the present paper with these
two lies in the fact that the proposed procedures achieve their nominal level
and a prescribed power over a set of vectors we characterize. In the Gaussian
white noise case, Juditsky and Nemirovski (2002) propose to test that
the signal belongs to the cone of nonnegative, nondecreasing or nonconcave
functions. For a given $r \in [1, +\infty[$, their tests are based on the estimation
of the $L^r$-distance between the signal and the cone. However, this approach
requires that the signal have a known smoothness under the null. In the
Gaussian white noise model, other tests of such qualitative hypotheses are
proposed by Dümbgen and Spokoiny (2001). Their procedure is based on
the supremum over all bandwidths of the distance in sup-norm between a
kernel estimator and the null hypothesis. They adopt a minimax point of
view to evaluate the performances of their tests and we adopt the same in
Sections 2 and 3.

1.3. Uniform separation rates and optimality. Comparison of the per- 
formances of tests naturally arises in the problem of hypothesis testing. In 
this paper, we shall mainly describe the performances of our procedures in 
terms of uniform separation rates over classes of smooth functions. Given 
$\beta$ in ]0, 1[, a class of smooth functions $\mathcal{F}$ and a "distance" $\Delta(\cdot)$ to the null
hypothesis, we define the uniform separation rate of a test $\Phi$ over $\mathcal{F}$, denoted
by $\rho(\Phi,\mathcal{F},\Delta)$, as the smallest number $\rho$ such that the test guarantees
a power not smaller than $1-\beta$ for all alternatives $F$ in $\mathcal{F}$ at distance $\rho$ from
the null. More precisely,

(6) $\rho(\Phi,\mathcal{F},\Delta) = \inf\{\rho > 0,\ \forall F \in \mathcal{F},\ \Delta(F) \ge \rho \Rightarrow P_F(\Phi \text{ rejects}) \ge 1-\beta\}.$

In the regression or Gaussian white noise model, the word "rate" refers to the
asymptotics of $\rho(\Phi,\mathcal{F},\Delta) = \rho_\tau(\Phi,\mathcal{F},\Delta)$ with respect to a scaling parameter
$\tau$ (the number of observations $n$ in the regression model, the level of the noise
in the Gaussian white noise). Comparing the performances of two tests of the
same level amounts to comparing their uniform separation rates (the smaller
the better). A test is said to be optimal if there exists no better test. The
uniform separation rate of an optimal test is called the minimax separation
rate. In the sequel, we shall enlarge this notion of optimality by saying that
a test is rate-optimal over $\mathcal{F}$ if its uniform separation rate differs from the
minimax one by a bounded function of $\tau$. Unfortunately, not much is known
about the uniform separation rates of the tests mentioned in Section 1.2. The
only exception we are aware of concerns the tests proposed by Dümbgen and
Spokoiny (2001) and Juditsky and Nemirovski (2002) in the Gaussian white
noise model (with $\tau = 1/\sqrt{n}$), and Baraud, Huet and Laurent (2003b) in the
regression model. The rates obtained by Juditsky and Nemirovski (2002) are
established for the problem of testing that $F$ belongs to $\mathcal{K} \cap \mathcal{H}$, where $\mathcal{H}$ is
a class of smooth functions. In contrast, in the papers by Baraud, Huet and
Laurent (2003b) and Dümbgen and Spokoiny (2001), the null hypothesis is
not restricted to those smooth functions belonging to $\mathcal{K}$. For the problem
of testing positivity and monotonicity, Baraud, Huet and Laurent (2003b)
established separation rates with respect to the $L^2(\mu_n)$-distance to the null.
For the problem of testing positivity, monotonicity and convexity, Dümbgen
and Spokoiny (2001) considered the problem of detecting a discrepancy to
the null in sup-norm. For any $L > 0$, their procedures are proved to achieve
the optimal rate $(L\log(n)/n)^{1/3}$ over the class of Lipschitz functions

$\mathcal{H}_1(L) = \{F,\ \forall x,y \in [0,1],\ |F(x) - F(y)| \le L|x - y|\}.$

The optimality of this rate derives from the lower bounds established by
Ingster [(1993), Section 2.4] for the simpler problem of testing $F = 0$
against $F \ne 0$ in sup-norm. More generally, it can easily be derived from
Ingster's results (see Proposition 2) that the minimax separation rate (in
sup-norm) over Hölderian balls

(7) $\mathcal{H}_s(L) = \{F,\ \forall x,y \in [0,1],\ |F(x) - F(y)| \le L|x - y|^s\}$ with $s \in ]0,1]$

is bounded from below (up to a constant) by $(L^{1/s}\log(n)/n)^{s/(1+2s)}$. In the
regression setting, we propose tests of positivity, monotonicity and convexity
whose uniform separation rates over $\mathcal{H}_s(L)$ achieve this lower bound whatever
the value of $s \in ]0,1]$ and $L > 0$. In this paper, we discuss the optimality
in the minimax sense over Hölderian balls with regularity $s$ in ]0, 1] only. To
our knowledge, the minimax rates over smoother classes of functions are
unknown. It is beyond the scope of this paper to describe them.

For the problem of testing monotonicity or convexity, other choices of 
distance to the null are possible, for example, the distance in sup-norm be- 
tween the first (resp. the second) derivative of F and the set of nonnegative 
functions. For such choices, Dümbgen and Spokoiny also provided uniform
separation rates for their tests. In the regression setting, the uniform sep- 
aration rates we get coincide with their separation rates on the classes of 
functions they considered. We do not know whether these rates are optimal 
or not either in the Gaussian white noise model or in the regression model. 

2. Tests based on local means for testing positivity, monotonicity and con- 
vexity. We consider the regression model given by (1) and propose tests of 
positivity, monotonicity and convexity for the function F. We first intro- 
duce some partitions of the design points and notation that will be used 
throughout the paper. 

2.1. Partition of the design points and notation. We first define an almost
regular partition of the set of indices $\{1,\dots,n\}$ into $\ell_n$ sets as follows:
for each $k$ in $\{1,\dots,\ell_n\}$ we set

$J_k = \{i \in \{1,\dots,n\},\ (k-1)/\ell_n < i/n \le k/\ell_n\}$

and define the partition as

$\mathcal{J}^{\ell_n} = \{J_k,\ k = 1,\dots,\ell_n\}.$

Then for each $\ell \in \{1,\dots,\ell_n\}$, we make a partition of $\{1,\dots,n\}$ into $\ell$ sets
by gathering consecutive sets $J_k$. This partition is defined by

$\mathcal{J}^{\ell} = \left\{J_j^{\ell} = \bigcup_{(j-1)/\ell < k/\ell_n \le j/\ell} J_k,\ j = 1,\dots,\ell\right\}.$

We shall use the following notation.

(a) We use a bold type style for denoting the vectors of $\mathbb{R}^n$. We endow
$\mathbb{R}^n$ with its Euclidean norm denoted by $\|\cdot\|$.

(b) For $\mathbf{v} \in \mathbb{R}^n$, let $\|\mathbf{v}\|_\infty = \max_{1 \le i \le n} |v_i|$.

(c) For a linear subspace $V$ of $\mathbb{R}^n$, $\Pi_V$ denotes the orthogonal projector
onto $V$.

(d) For $a \in \mathbb{R}_+$, $D \in \mathbb{N}\setminus\{0\}$ and $u \in [0,1]$, $\bar\Phi^{-1}(u)$ and $\bar\chi^{-1}_{D,a^2}(u)$ denote the
$1-u$ quantile of, respectively, a standard Gaussian random variable and a
noncentral $\chi^2$ variable with $D$ degrees of freedom and noncentrality parameter $a^2$.

(e) For $x \in \mathbb{R}$, $[x]$ denotes the integer part of $x$.

(f) For each $\mathbb{R}^n$-vector $\mathbf{v}$ and subset $J$ of $\{1,\dots,n\}$, we denote by $\mathbf{v}_J$
the $\mathbb{R}^n$-vector whose coordinates coincide with those of $\mathbf{v}$ on $J$ and vanish
elsewhere. We denote by $\bar v_J$ the quantity $\sum_{i \in J} v_i/|J|$.

(g) We denote by $\mathbf{1}$ the $\mathbb{R}^n$-vector $(1,\dots,1)'$ and by $\mathbf{e}_i$ the $i$th vector of
the canonical basis.

(h) We define $V_{n,\mathrm{cste}}$ as the linear span of $\{\mathbf{1}_J,\ J \in \mathcal{J}^{\ell_n}\}$. Note that the
dimension of $V_{n,\mathrm{cste}}$ equals $\ell_n$.

(i) The vector $\boldsymbol{\varepsilon}$ denotes a standard Gaussian variable in $\mathbb{R}^n$.

(j) We denote by $P_{\mathbf{f},\sigma}$ the law of the Gaussian vector in $\mathbb{R}^n$ with expectation
$\mathbf{f}$ and covariance matrix $\sigma^2 I_n$, where $I_n$ is the $n \times n$ identity matrix.
We denote by $P_{F,\sigma}$ the law of $\mathbf{Y}$ under the model defined by (1).

(k) The level $\alpha$ of all our tests is chosen in ]0, 1/2[.
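
The nested partitions of Section 2.1 are easy to build numerically. The following Python sketch is only illustrative: it assumes that the base set $J_k$ collects the indices $i$ with $(k-1)/\ell_n < i/n \le k/\ell_n$ and that the sets of $\mathcal{J}^{\ell}$ gather the consecutive $J_k$ with $(j-1)/\ell < k/\ell_n \le j/\ell$. The function names `base_partition` and `coarse_partition` are hypothetical.

```python
import numpy as np

def base_partition(n, l_n):
    """Return the l_n base sets J_k (1-based indices), k = 1..l_n."""
    i = np.arange(1, n + 1)
    # (k-1)/l_n < i/n <= k/l_n  is equivalent to  k = ceil(i * l_n / n);
    # the integer ceiling avoids floating-point rounding.
    k = -(-(i * l_n) // n)
    return [i[k == j] for j in range(1, l_n + 1)]

def coarse_partition(n, l_n, l):
    """Return the l sets of the coarser partition, each a union of consecutive J_k."""
    J = base_partition(n, l_n)
    k = np.arange(1, l_n + 1)
    # (j-1)/l < k/l_n <= j/l  is equivalent to  j = ceil(k * l / l_n)
    j_of_k = -(-(k * l) // l_n)
    return [np.concatenate([J[m] for m in range(l_n) if j_of_k[m] == j])
            for j in range(1, l + 1)]
```

For instance, `coarse_partition(10, 5, 2)` merges the five base pairs into the two sets {1,...,4} and {5,...,10}; under these assumptions the sets need not have equal sizes when $\ell$ does not divide $\ell_n$.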



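2.2. Test of positivity. We propose a level-$\alpha$ test for testing that $F$ belongs
to $\mathcal{K}_{\ge 0}$ defined by (2). The testing procedure is based on the fact that
if $F$ is nonnegative, then for any subset $J$ of $\{1,\dots,n\}$ the expectation of
$\bar Y_J$ is nonnegative. For $\ell \in \{1,\dots,\ell_n\}$, let $T_1^{\ell}(\mathbf{Y})$ be defined as

$T_1^{\ell}(\mathbf{Y}) = \max_{J \in \mathcal{J}^{\ell}} \sqrt{|J|(n-\ell_n)}\ \frac{-\bar Y_J}{\|\mathbf{Y} - \Pi_{V_{n,\mathrm{cste}}}\mathbf{Y}\|}$

and let $q_1(\ell,u)$ be the $1-u$ quantile of the random variable $T_1^{\ell}(\boldsymbol{\varepsilon})$. We
introduce the test statistic

(9) $T_{\alpha,1} = \max_{\ell \in \{1,\dots,\ell_n\}} \{T_1^{\ell}(\mathbf{Y}) - q_1(\ell,u_\alpha)\},$

where $u_\alpha$ is defined as

(10) $u_\alpha = \sup\left\{u \in ]0,1[,\ P\left(\max_{\ell \in \{1,\dots,\ell_n\}} \{T_1^{\ell}(\boldsymbol{\varepsilon}) - q_1(\ell,u)\} > 0\right) \le \alpha\right\}.$

We reject that $F$ belongs to $\mathcal{K}_{\ge 0}$ if $T_{\alpha,1}$ is positive.

Comment. When $\ell$ increases from 1 to $\ell_n$, the cardinality of the sets
$J \in \mathcal{J}^{\ell}$ decreases. We thus take into account local discrepancies to the null
hypothesis for various scales.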
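
To make the statistic of Section 2.2 concrete, here is a minimal Python sketch. It assumes the studentized form $T_1^{\ell}(\mathbf{Y}) = \max_{J} -\sqrt{|J|(n-\ell_n)}\,\bar Y_J / \|\mathbf{Y} - \Pi_{V_{n,\mathrm{cste}}}\mathbf{Y}\|$, and it builds the partitions with `numpy.array_split`, which is only a convenient stand-in for the almost regular partitions of Section 2.1. All function names are hypothetical.

```python
import numpy as np

def nested_partitions(n, l_n):
    """Stand-in partitions (0-based indices): l_n almost equal base blocks,
    merged consecutively into l groups for every l = 1..l_n."""
    base = np.array_split(np.arange(n), l_n)
    parts = {}
    for l in range(1, l_n + 1):
        groups = np.array_split(np.arange(l_n), l)
        parts[l] = [np.concatenate([base[k] for k in g]) for g in groups]
    return base, parts

def sigma_hat(y, base):
    """||y - Pi_V y|| / sqrt(n - l_n), V = piecewise constants on the base blocks."""
    resid = np.asarray(y, dtype=float).copy()
    for J in base:
        resid[J] -= resid[J].mean()
    return np.linalg.norm(resid) / np.sqrt(len(resid) - len(base))

def positivity_stat(y, partition, base):
    """Largest value of -sqrt(|J|) * mean(y_J) over the partition, studentized."""
    y = np.asarray(y, dtype=float)
    return max(-np.sqrt(len(J)) * y[J].mean() for J in partition) / sigma_hat(y, base)
```

A large positive value signals a block on which the local mean is significantly negative; the statistic still has to be compared with the simulated quantiles $q_1(\ell,u_\alpha)$ before any decision is taken.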



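2.3. Testing monotonicity. We now consider the problem of testing that
$F$ belongs to $\mathcal{K}_{\nearrow}$ defined by (3). The testing procedure relies on the following
property: if $I$ and $J$ are two subsets of $\{1,2,\dots,n\}$ such that $I$ is on the
left of $J$ and if $F \in \mathcal{K}_{\nearrow}$, then the expectation of the difference $\bar Y_I - \bar Y_J$ is
nonpositive. For $\ell \in \{2,\dots,\ell_n\}$, let $T_2^{\ell}(\mathbf{Y})$ be defined as

$T_2^{\ell}(\mathbf{Y}) = \max_{1 \le i < j \le \ell} N_{ij}^{\ell}\ \frac{\bar Y_{J_i^{\ell}} - \bar Y_{J_j^{\ell}}}{\|\mathbf{Y} - \Pi_{V_{n,\mathrm{cste}}}\mathbf{Y}\|/\sqrt{n-\ell_n}},$

where, $J_1^{\ell},\dots,J_{\ell}^{\ell}$ denoting the consecutive sets of $\mathcal{J}^{\ell}$,

$N_{ij}^{\ell} = \left(\frac{1}{|J_i^{\ell}|} + \frac{1}{|J_j^{\ell}|}\right)^{-1/2},$

and let $q_2(\ell,u)$ be the $1-u$ quantile of the random variable $T_2^{\ell}(\boldsymbol{\varepsilon})$. We
introduce the test statistic

(11) $T_{\alpha,2} = \max_{\ell \in \{2,\dots,\ell_n\}} \{T_2^{\ell}(\mathbf{Y}) - q_2(\ell,u_\alpha)\},$

where $u_\alpha$ is defined as

(12) $u_\alpha = \sup\left\{u \in ]0,1[,\ P\left(\max_{\ell \in \{2,\dots,\ell_n\}} \{T_2^{\ell}(\boldsymbol{\varepsilon}) - q_2(\ell,u)\} > 0\right) \le \alpha\right\}.$

We reject that $F$ belongs to $\mathcal{K}_{\nearrow}$ if $T_{\alpha,2}$ is positive.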
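
The statistic based on differences of local means can be sketched in Python as follows. This is an illustration under assumptions: the normalization $N_{ij}^{\ell} = (1/|J_i^{\ell}| + 1/|J_j^{\ell}|)^{-1/2}$ and the studentization by $\|\mathbf{Y} - \Pi_{V_{n,\mathrm{cste}}}\mathbf{Y}\|/\sqrt{n-\ell_n}$ are taken as above, the blocks are passed as lists of 0-based index arrays ordered from left to right, and the name `monotonicity_stat` is hypothetical.

```python
import numpy as np

def monotonicity_stat(y, partition, base):
    """max over i < j of N_ij * (mean(y on J_i) - mean(y on J_j)), studentized.

    partition: blocks of the current scale, ordered from left to right.
    base:      finest blocks, used only to estimate the noise level.
    """
    y = np.asarray(y, dtype=float)
    resid = y.copy()
    for J in base:
        resid[J] -= y[J].mean()          # residual of the piecewise-constant fit
    scale = np.linalg.norm(resid) / np.sqrt(len(y) - len(base))
    means = [y[J].mean() for J in partition]
    sizes = [len(J) for J in partition]
    best = -np.inf
    for i in range(len(partition)):
        for j in range(i + 1, len(partition)):
            n_ij = (1.0 / sizes[i] + 1.0 / sizes[j]) ** -0.5
            best = max(best, n_ij * (means[i] - means[j]))
    return best / scale
```

The double loop costs $O(\ell^2)$ per scale, which is negligible for the sample sizes considered in Section 5; a left block with a much larger local mean than some block to its right yields a large positive value.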

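2.4. Testing convexity. We now consider the problem of testing that $F$
belongs to $\mathcal{K}_{\smile}$ defined by (4). The testing procedure is based on the following
property: if $I$, $J$ and $K$ are three subsets of $\{1,2,\dots,n\}$ such that $J$ is
between $I$ and $K$ and if $F \in \mathcal{K}_{\smile}$, then we find a linear combination of $\bar Y_I$,
$\bar Y_J$ and $\bar Y_K$ with nonpositive expectation. Let $\mathbf{x} = (x_1,\dots,x_n)'$ and for each
$\ell \in \{3,\dots,\ell_n\}$, $1 \le i < j < k \le \ell$, let

$\lambda_{ijk}^{\ell} = \frac{\bar x_{J_k^{\ell}} - \bar x_{J_j^{\ell}}}{\bar x_{J_k^{\ell}} - \bar x_{J_i^{\ell}}}$

and

$N_{ijk}^{\ell} = \left(\frac{1}{|J_j^{\ell}|} + \frac{(\lambda_{ijk}^{\ell})^2}{|J_i^{\ell}|} + \frac{(1-\lambda_{ijk}^{\ell})^2}{|J_k^{\ell}|}\right)^{-1/2}.$

For $\ell \in \{3,\dots,\ell_n\}$, let

$T_3^{\ell}(\mathbf{Y}) = \max_{1 \le i < j < k \le \ell} N_{ijk}^{\ell}\ \frac{\bar Y_{J_j^{\ell}} - \lambda_{ijk}^{\ell}\bar Y_{J_i^{\ell}} - (1-\lambda_{ijk}^{\ell})\bar Y_{J_k^{\ell}}}{\|\mathbf{Y} - \Pi_{V_{n,\mathrm{cste}}}\mathbf{Y}\|/\sqrt{n-\ell_n}},$

and let $q_3(\ell,u)$ be the $1-u$ quantile of the random variable $T_3^{\ell}(\boldsymbol{\varepsilon})$. We
introduce the test statistic

(13) $T_{\alpha,3} = \max_{\ell \in \{3,\dots,\ell_n\}} \{T_3^{\ell}(\mathbf{Y}) - q_3(\ell,u_\alpha)\},$

where $u_\alpha$ is defined as

(14) $u_\alpha = \sup\left\{u \in ]0,1[,\ P\left(\max_{\ell \in \{3,\dots,\ell_n\}} \{T_3^{\ell}(\boldsymbol{\varepsilon}) - q_3(\ell,u)\} > 0\right) \le \alpha\right\}.$

We reject that $F$ belongs to $\mathcal{K}_{\smile}$ if $T_{\alpha,3}$ is positive.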

2.5. Properties of the procedures. In this section we evaluate the perfor- 
mances of the previous procedures under the null and under smooth alter- 
natives. 

Proposition 1. Let $(T_\alpha,\mathcal{K})$ be either $(T_{\alpha,1},\mathcal{K}_{\ge 0})$ or $(T_{\alpha,2},\mathcal{K}_{\nearrow})$ or
$(T_{\alpha,3},\mathcal{K}_{\smile})$. We have

$\sup_{\sigma > 0}\ \sup_{F \in \mathcal{K}} P_{F,\sigma}(T_\alpha > 0) = \alpha.$

Assume now that $x_i = i/n$ for all $i = 1,\dots,n$ and $\ell_n = [n/2]$. Let us fix
$\beta \in ]0,1[$ and define for each $s \in ]0,1]$ and $L > 0$

$\rho_n = L^{1/(1+2s)}\left(\frac{\sigma^2\log(n)}{n}\right)^{s/(1+2s)}.$

Then for $n$ large enough there exists some constant $\kappa$ depending on $\alpha, \beta, s$
only such that for all $F \in \mathcal{H}_s(L)$ satisfying

(15) $\Delta(F) = \inf_{G \in \mathcal{K}} \|F - G\|_\infty \ge \kappa\rho_n$

we have

$P_{F,\sigma}(T_\alpha > 0) \ge 1-\beta.$

Comment. This result states that our procedures are of size $\alpha$. Moreover,
following the definition of the uniform separation rate of a test given in
Section 1.3, this result shows that the tests achieve the uniform separation
rate $\rho_n$ (in sup-norm) over the Hölderian ball $\mathcal{H}_s(L)$. In the following proposition,
we show that this rate cannot be improved at least in the Gaussian
white noise model for testing positivity and monotonicity. The proof can be
extended to the case of testing convexity but is omitted here.

Proposition 2. Let $Y$ be the observation from the Gaussian white noise
model

(16) $dY(t) = F(t)\,dt + \frac{\sigma}{\sqrt{n}}\,dW(t)$ for $t \in [0,1]$,

where $W$ is a standard Brownian motion. Let $\mathcal{K}$ be either the set $\mathcal{K}_{\ge 0}$ or
$\mathcal{K}_{\nearrow}$ and let $\mathcal{F}$ be some class of functions. For the distance $\Delta(\cdot)$ to $\mathcal{K}$ given
by (15), we define

$\rho_n(0,\mathcal{F}) = \inf_{\Phi} \rho(\Phi,\mathcal{F},\Delta),$

where $\rho(\Phi,\mathcal{F},\Delta)$ is given by (6) and where the infimum is taken over all
tests $\Phi$ of level $3\alpha$ for testing "$F = 0$." We define $\rho_n(\mathcal{K},\mathcal{F})$ similarly by
taking the infimum over all tests $\Phi$ of level $\alpha$ for testing "$F \in \mathcal{K}$." The
following inequalities hold:

(i) If $\mathcal{K} = \mathcal{K}_{\ge 0}$, then

$\rho_n(\mathcal{K},\mathcal{F}) \ge \rho_n(0,\mathcal{F}).$

If $\mathcal{K} = \mathcal{K}_{\nearrow}$, then for some constant $\kappa$ depending on $\alpha$ and $\beta$ only

$\rho_n(\mathcal{K},\mathcal{F}) \ge \frac{1}{2}\left(\rho_n(0,\mathcal{F}) - \kappa\frac{\sigma}{\sqrt{n}}\right).$

(ii) In particular, if $\mathcal{F} = \mathcal{H}_s(L)$, for $n$ large enough there exists some
constant $\kappa'$ depending on $\alpha$, $\beta$ and $s$ only such that

(17) $\rho_n(\mathcal{K},\mathcal{H}_s(L)) \ge \kappa' L^{1/(1+2s)}\left(\frac{\sigma^2\log(n)}{n}\right)^{s/(1+2s)}.$

The proof of the first part of the proposition extends easily to the regression
framework. The second part (ii), namely (17), derives from (i) and the
lower bound on $\rho_n(0,\mathcal{F})$ established by Ingster (1993).

For the problem of testing the positivity of a signal in the Gaussian white
noise model, Juditsky and Nemirovski (2002) showed that the minimax separation
rate with respect to the $L^r$-distance ($r \in [1,+\infty[$) is of the same
order as $\rho_n$ up to a logarithmic factor.

3. Testing that F satisfies a differential inequality. In this section, we 
consider the problem of testing that $F$ belongs to $\mathcal{K}_{r,R}$ defined by (5). Several
applications of such hypotheses can be of interest. For example, by taking
$r = 1$ and $R(x) = -\exp(ax)$ (for some positive number $a$), one can test that
a positive function $F$ is decreasing at rate $\exp(-ax)$, that is, satisfies

$\forall x \in [0,1] \quad 0 < F(x) \le F(0)\exp(-ax).$

Other kinds of decay are possible by suitably choosing the function $R$. Another
application is to test that $F$ belongs to the class of smooth functions

$\{F : [0,1] \to \mathbb{R},\ \|F^{(r)}\|_\infty \le L\}.$

To tackle this problem, it is enough to test that the derivatives of order
$r$ of the functions $F_1(x) = -F(x) + Lx^r/r!$ and $F_2(x) = F(x) + Lx^r/r!$ are
positive. This is easily done by considering a multiple testing procedure
based on the data $-Y_i + Lx_i^r/r!$ for testing that $F_1^{(r)}$ is positive, and on
$Y_i + Lx_i^r/r!$ for testing that $F_2^{(r)}$ is positive.
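
The data transformation described above is a one-liner in practice. The Python sketch below only illustrates that step (the function name `smoothness_test_data` is hypothetical); each returned sample would then be fed to a test that the derivative of order $r$ is nonnegative.

```python
import numpy as np
from math import factorial

def smoothness_test_data(x, y, r, L):
    """Return the two transformed samples used to test ||F^(r)||_inf <= L.

    First output:  -y_i + L * x_i**r / r!   (data associated with F_1)
    Second output:  y_i + L * x_i**r / r!   (data associated with F_2)
    """
    x = np.asarray(x, dtype=float)
    y = np.asarray(y, dtype=float)
    shift = L * x**r / factorial(r)
    return -y + shift, y + shift
```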

In Section 3.1 we consider the case where the function R equals 1. The 
procedure then amounts to testing that the derivative of order r of F is 
nonnegative. We turn to the general case in Section 3.2. 

We first introduce the following notation. 






Y. BARAUD, S. HUET AND B. LAURENT 



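(a) For $\mathbf{w} \in \mathbb{R}^n$, we denote by $R \star \mathbf{w}$ the vector whose $i$th coordinate
$(R \star \mathbf{w})_i$ equals $R(x_i)w_i$.

(b) For $k \in \mathbb{N}\setminus\{0\}$, we denote by $\mathbf{w}^k$ the $\mathbb{R}^n$-vector $(w_1^k,\dots,w_n^k)'$, and we
set $\mathbf{w}^0 = \mathbf{1}$ by convention.

(c) For $J \subset \{1,\dots,n\}$, let us define $\mathcal{X}_J$ as the space spanned by $\mathbf{1}_J, \mathbf{x}_J, \dots, \mathbf{x}_J^{r-1}$.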

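3.1. Testing that the derivative of order r of F is nonnegative. In this
section we take $R(x) = 1$ for all $x \in [0,1]$. The procedure relies on the idea
that if the derivative of order $r$ of $F$ is nonnegative, then on each subset $J$
of $\{1,2,\dots,n\}$, the highest degree coefficient of the polynomial regression of
degree $r$ based on the pairs $\{(x_i,F(x_i)),\ i \in J\}$ is nonnegative. For example,
under the assumption that $F$ is nondecreasing, the slope of the regression
based on the pairs $\{(x_i,F(x_i)),\ i \in J\}$ is nonnegative.

Let $\ell_n = [n/(2(r+1))]$, let $V_n$ be the linear span of $\{\mathbf{1}_J, \mathbf{x}_J, \dots, \mathbf{x}_J^r,\ J \in \mathcal{J}^{\ell_n}\}$,
let $d_n$ denote its dimension, and for each $J \subset \{1,\dots,n\}$ let

$\mathbf{t}_J^* = -\frac{\mathbf{x}_J^r - \Pi_{\mathcal{X}_J}\mathbf{x}_J^r}{\|\mathbf{x}_J^r - \Pi_{\mathcal{X}_J}\mathbf{x}_J^r\|}.$

For each $\ell \in \{1,\dots,\ell_n\}$, let $T^{\ell}(\mathbf{Y})$ be defined as

(18) $T^{\ell}(\mathbf{Y}) = \max_{J \in \mathcal{J}^{\ell}} \sqrt{n-d_n}\ \frac{\langle \mathbf{Y}, \mathbf{t}_J^*\rangle}{\|\mathbf{Y} - \Pi_{V_n}\mathbf{Y}\|}$

and let $q(\ell,u)$ denote the $1-u$ quantile of the random variable $T^{\ell}(\boldsymbol{\varepsilon})$. We
introduce the following test statistic:

(19) $T_\alpha = \max_{\ell \in \{1,\dots,\ell_n\}} \{T^{\ell}(\mathbf{Y}) - q(\ell,u_\alpha)\},$

where $u_\alpha$ is defined as

(20) $u_\alpha = \sup\left\{u \in ]0,1[,\ P\left(\max_{\ell \in \{1,\dots,\ell_n\}} \{T^{\ell}(\boldsymbol{\varepsilon}) - q(\ell,u)\} > 0\right) \le \alpha\right\}.$

We reject the null hypothesis if $T_\alpha$ is positive.

Comment. When $r = 1$, the procedure is akin to that proposed by
Hall and Heckman (2000) where for all $\ell$, $q(\ell,u_\alpha)$ is the $1-\alpha$ quantile
of $\max_{\ell \in \{1,\dots,\ell_n\}} T^{\ell}(\boldsymbol{\varepsilon})$.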
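
For $r = 1$ the statistic reduces to standardized local slopes, which the following Python sketch illustrates. It assumes the form $T^{\ell}(\mathbf{Y}) = \max_J \sqrt{n-d_n}\,\langle\mathbf{Y},\mathbf{t}_J^*\rangle/\|\mathbf{Y} - \Pi_{V_n}\mathbf{Y}\|$ with $\mathbf{t}_J^*$ proportional to $\bar x_J\mathbf{1}_J - \mathbf{x}_J$ and $d_n = 2\ell_n$; blocks are given as lists of 0-based index arrays and the name `local_gradient_stat` is hypothetical.

```python
import numpy as np

def local_gradient_stat(x, y, partition, base):
    """Studentized maximum of <y, t_J> over the blocks J of `partition` (r = 1)."""
    x = np.asarray(x, dtype=float)
    y = np.asarray(y, dtype=float)
    n, d_n = len(y), 2 * len(base)
    # Residual sum of squares of the piecewise-linear fit on the base blocks,
    # i.e. ||y - Pi_V y||^2 with V spanned by the 1_J and x_J of the finest scale.
    rss = 0.0
    for J in base:
        A = np.column_stack([np.ones(len(J)), x[J]])
        coef, *_ = np.linalg.lstsq(A, y[J], rcond=None)
        rss += np.sum((y[J] - A @ coef) ** 2)
    scale = np.sqrt(rss / (n - d_n))
    vals = []
    for J in partition:
        xc = x[J] - x[J].mean()
        # <y, t_J> with t_J = -(x_J - mean(x_J) 1_J) / ||x_J - mean(x_J) 1_J||
        vals.append(-(xc @ y[J]) / np.linalg.norm(xc))
    return max(vals) / scale
```

A negative local slope produces a positive contribution, so large values of the statistic point towards a violation of monotonicity on some block.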

3.2. Extension to the general case. The ideas underlying the preceding
procedures extend to the case where $R \not\equiv 1$. In the general case, the test is
obtained as follows.

Let $\ell_n$ be such that the dimension $d_n$ of the linear space

(21) $V_n = \mathrm{Span}\{\mathbf{1}_J, \mathbf{x}_J, \dots, \mathbf{x}_J^r, R \star \mathbf{1}_J, \dots, R \star \mathbf{x}_J^r,\ J \in \mathcal{J}^{\ell_n}\}$

is not larger than $n/2$. We define for each $J \subset \{1,\dots,n\}$

(22) $\mathbf{t}_J^* = -\frac{1}{\gamma_J}\, R \star (\mathbf{x}_J^r - \Pi_{\mathcal{X}_J}\mathbf{x}_J^r)$ where $\gamma_J = \|R \star (\mathbf{x}_J^r - \Pi_{\mathcal{X}_J}\mathbf{x}_J^r)\|.$

We reject that $F$ belongs to $\mathcal{K}_{r,R}$ if $T_\alpha$ defined by (19) is positive.

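3.3. Properties of the tests. In this section we describe the behavior of
the procedure. We start with some notation.

(a) Let us define the function $\Lambda(F)$ as

$\Lambda(F)(x) = \frac{d^r}{dx^r}[R(x)F(x)],$

and let $\omega$ be its modulus of continuity defined for all $h > 0$ by

$\omega(h) = \sup_{|x-y| \le h} |\Lambda(F)(x) - \Lambda(F)(y)|.$

(b) For $J \in \bigcup_{\ell=1}^{\ell_n} \mathcal{J}^{\ell}$ let us denote by $x_J^-$ (resp. $x_J^+$) the quantities $\min\{x_i,\ i \in J\}$
(resp. $\max\{x_i,\ i \in J\}$) and set $h_J = x_J^+ - x_J^-$.

(c) Let $\mathbf{f} = (F(x_1),\dots,F(x_n))'$ and for each $\ell = 1,\dots,\ell_n$ and $\beta \in ]0,1[$,
let

(23) $v_\ell(\mathbf{f},\beta) = \left(q(\ell,u_\alpha)\sqrt{\frac{\bar\chi^{-1}_{n-d_n,\,\|\mathbf{f} - \Pi_{V_n}\mathbf{f}\|^2/\sigma^2}(\beta/2)}{n-d_n}} + \bar\Phi^{-1}(\beta/2)\right)\sigma.$

(d) For each $\rho > 0$, let

$\mathcal{E}_{n,r}(\rho) = \left\{F : [0,1] \to \mathbb{R},\ F^{(r)} \in \mathcal{H}_s(L),\ -\inf_{x \in [0,1]} F^{(r)}(x) \ge \rho\right\}.$

We have the following result.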

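Proposition 3. Let $T_\alpha$ be the test statistic defined in Section 3.2. We
have

$\sup_{\sigma > 0}\ \sup_{F \in \mathcal{K}_{r,R}} P_{F,\sigma}(T_\alpha > 0) = \alpha.$

For each $\beta \in ]0,1[$ we have

$P_{F,\sigma}(T_\alpha > 0) \ge 1-\beta,$

if for some $\ell \in \{1,\dots,\ell_n\}$ there exists a set $J \in \mathcal{J}^{\ell}$ such that either

(24) $-\inf_{i \in J} \Lambda(F)(x_i) \ge v_\ell(\mathbf{f},\beta)\,\frac{r!\,\gamma_J}{\|\mathbf{x}_J^r - \Pi_{\mathcal{X}_J}\mathbf{x}_J^r\|^2} + \omega(h_J),$

or

(25) $\inf_{x \in [x_J^-,\,x_J^+]} -\Lambda(F)(x) \ge v_\ell(\mathbf{f},\beta)\,\frac{r!\,\gamma_J}{\|\mathbf{x}_J^r - \Pi_{\mathcal{X}_J}\mathbf{x}_J^r\|^2}.$

Moreover, if $R = 1$, then there exists some constant $\kappa$ depending on $\alpha, \beta, s$
and $r$ only such that for $n$ large enough and for all $F \in \mathcal{E}_{n,r}(\rho_{n,r})$ with

$\rho_{n,r} = \kappa\, L^{(1+2r)/(1+2(s+r))}\left(\frac{\sigma^2\log(n)}{n}\right)^{s/(1+2(s+r))}$

we have

$P_{F,\sigma}(T_\alpha > 0) \ge 1-\beta.$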



Comment 1. In the particular case where $R = 1$, let us give the orders
of magnitude of the quantities appearing in the above proposition. Under the
assumption that $\|\mathbf{f} - \Pi_{V_n}\mathbf{f}\|^2/n$ is smaller than $\sigma^2$, one can show that $v_\ell$ is of
order $\sqrt{\log(n)}$ (see Section 9.2). When $R = 1$, we have $\gamma_J = \|\mathbf{x}_J^r - \Pi_{\mathcal{X}_J}\mathbf{x}_J^r\|$
and it follows from computations that will be detailed in the proofs that 



r^j r / log(n) 

for some constant C which does not depend on J or n. 



(26) Mf,Ph — — ^<C\, 



Comment 2. In the particular case where $r = 1$, (26) allows us to
compare our result to the performance of the test proposed by Ghosal,
Sen and van der Vaart (2000). For each $\delta \in ]0,1/3[$, they give a procedure
(depending on $\delta$) that is powerful if the function $F$ is continuously
differentiable and satisfies that for all $x$ in some interval of length $n^{-\delta}$,
$F'(x) < -M\sqrt{\log(n)}\,n^{-(1-3\delta)/2}$ for some $M$ large enough.

By using (25) and the upper bound in (26) with $h_J$ of order $n^{-\delta}$, we
deduce from Proposition 3 that our procedure is powerful too over this class
of functions. Note that by considering a multiple testing procedure based on
various scales $\ell$, our test does not depend on $\delta$ and is therefore powerful for
all $\delta$ simultaneously.



Comment 3. For $r = 1$ (resp. $r = 2$) and $s = 1$, Dümbgen and Spokoiny
(2001) obtained the uniform separation rate $\rho_{n,r}$ for testing monotonicity
(resp. convexity) in the Gaussian white noise model.



Comment 4. For the problem of testing monotonicity (r = 1 and R = 
1), it is possible to combine this procedure with that proposed in Section 2.3. 
More precisely, consider the test which rejects the null at level $2\alpha$ if one of
these two tests rejects. The so-defined test performs as well as the best of 
these two tests under the alternative.

4. A general approach. The problems we have considered previously reduce
to testing that $\mathbf{f} = (F(x_1),\dots,F(x_n))'$ belongs to a convex set of the
form

(27) $\mathcal{C} = \{\mathbf{f} \in \mathbb{R}^n,\ \forall j \in \{1,\dots,p\}\ \langle \mathbf{f}, \mathbf{v}_j\rangle \le 0\},$

where the vectors $\{\mathbf{v}_1,\dots,\mathbf{v}_p\}$ are linearly independent in $\mathbb{R}^n$. For example,
testing that the regression function $F$ is nonnegative or nondecreasing
amounts to testing that the mean of $\mathbf{Y}$ belongs, respectively, to the convex
subsets of $\mathbb{R}^n$

(28) $\mathcal{C}_{\ge 0} = \{\mathbf{f} \in \mathbb{R}^n,\ \forall i \in \{1,\dots,n\}\ f_i \ge 0\}$

and

(29) $\mathcal{C}_{\nearrow} = \{\mathbf{f} \in \mathbb{R}^n,\ \forall i \in \{1,\dots,n-1\}\ f_{i+1} - f_i \ge 0\}.$

Clearly, these sets are of the form given by (27) by taking, respectively,
$p = n$, $\mathbf{v}_j = -\mathbf{e}_j$ and $p = n-1$, $\mathbf{v}_j = \mathbf{e}_j - \mathbf{e}_{j+1}$. The following proposition
extends this result to the general case. Note that one can also define the set
$\mathcal{C}$ as

$\mathcal{C} = \{\mathbf{f} \in \mathbb{R}^n,\ L_1(\mathbf{f}) \ge 0,\dots,L_p(\mathbf{f}) \ge 0\},$

where the $L_j$'s are $p$ independent linear forms. We shall use this definition
of $\mathcal{C}$ in the following.
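
The two examples (28) and (29) can be checked numerically through the vectors $\mathbf{v}_j$. The Python sketch below is illustrative only (function names are hypothetical): it stores the $\mathbf{v}_j$ as the rows of a matrix and tests the inequalities $\langle\mathbf{f},\mathbf{v}_j\rangle \le 0$ of (27).

```python
import numpy as np

def constraint_vectors_nonneg(n):
    """Rows v_j = -e_j, j = 1..n: <f, v_j> <= 0 means f_j >= 0."""
    return -np.eye(n)

def constraint_vectors_nondecr(n):
    """Rows v_j = e_j - e_{j+1}, j = 1..n-1: <f, v_j> <= 0 means f_{j+1} >= f_j."""
    V = np.zeros((n - 1, n))
    idx = np.arange(n - 1)
    V[idx, idx] = 1.0
    V[idx, idx + 1] = -1.0
    return V

def in_cone(f, V, tol=0.0):
    """True when <f, v_j> <= tol for every row v_j of V."""
    return bool(np.all(V @ np.asarray(f, dtype=float) <= tol))
```

For example, `in_cone([0, 1, 1, 3], constraint_vectors_nondecr(4))` holds, whereas the vector (0, 2, 1, 3) violates the second constraint.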

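Proposition 4. For each $r \in \{1,\dots,n-1\}$ and $i \in \{1,\dots,n-r\}$ let
$\phi_{i,r}$ be the linear form defined for $\mathbf{w} \in \mathbb{R}^n$ by

$\phi_{i,r}(\mathbf{w}) = \det\begin{pmatrix} 1 & x_i & \cdots & x_i^{r-1} & w_i \\ 1 & x_{i+1} & \cdots & x_{i+1}^{r-1} & w_{i+1} \\ \vdots & \vdots & & \vdots & \vdots \\ 1 & x_{i+r} & \cdots & x_{i+r}^{r-1} & w_{i+r} \end{pmatrix}.$

If $F$ belongs to $\mathcal{K}_{\smile}$, then $\mathbf{f} = (F(x_1),\dots,F(x_n))'$ belongs to

(30) $\mathcal{C}_{\smile} = \{\mathbf{f} \in \mathbb{R}^n,\ \forall i \in \{1,\dots,n-2\},\ \phi_{i,2}(\mathbf{f}) \ge 0\}.$

If $F$ belongs to $\mathcal{K}_{r,R}$, then $\mathbf{f}$ belongs to

$\mathcal{C}_{r,R} = \{\mathbf{f} \in \mathbb{R}^n,\ \forall i \in \{1,\dots,n-r\},\ \phi_{i,r}(R \star \mathbf{f}) \ge 0\}.$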
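
The linear forms of Proposition 4 are plain determinants and can be evaluated directly. The sketch below assumes that $\phi_{i,r}(\mathbf{w})$ is the determinant of the $(r+1)\times(r+1)$ matrix whose rows are $(1, x_k, \dots, x_k^{r-1}, w_k)$ for $k = i,\dots,i+r$; it uses a 0-based index $i$ and the hypothetical name `phi`.

```python
import numpy as np

def phi(x, w, i, r):
    """Determinant form phi_{i,r}(w), with a 0-based starting index i."""
    xs = np.asarray(x, dtype=float)[i:i + r + 1]
    ws = np.asarray(w, dtype=float)[i:i + r + 1]
    # Columns 1, x, ..., x^(r-1), followed by the column of values w
    M = np.column_stack([np.vander(xs, r, increasing=True), ws])
    return np.linalg.det(M)
```

With $r = 1$ the form is simply $w_{i+1} - w_i$, and with $r = 2$ and increasing design points it is positive for the values of a strictly convex function, in line with (30).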

With the aim of keeping our notation as simple as possible, we omit the 
dependence of the linear forms $\phi_{i,r}$ on $r$ when there is no ambiguity. The
remaining part of the section is organized as follows. In the next section we 
present a general approach for the problem of testing that f belongs to C. 
In the last section we show how this approach applies to the problems of 
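hypothesis testing considered in Sections 2 and 3.

4.1. Testing that f belongs to C. We consider the problem of testing that
the vector $\mathbf{f} = (f_1,\dots,f_n)'$ involved in the regression model

(31) $Y_i = f_i + \sigma\varepsilon_i, \quad i = 1,\dots,n,$

belongs to $\mathcal{C}$ defined by (27). Our aim is twofold: first, build a test which
achieves its nominal level, and second, describe for each $n$ a class of vectors
over which this test is powerful.

The testing procedure. The testing procedure relies on the following idea:
since under the assumption that $\mathbf{f}$ belongs to $\mathcal{C}$, the quantities $\langle \mathbf{f}, \sum_{j=1}^p \lambda_j\mathbf{v}_j\rangle$
are nonpositive for all nonnegative numbers $\lambda_1,\dots,\lambda_p$, we base our test
statistic on random variables of the form $\langle \mathbf{Y}, \sum_{j=1}^p \lambda_j\mathbf{v}_j\rangle$ for nonnegative
sequences of $\lambda_j$'s.

We denote by $\mathcal{T}$ the subset of $\mathbb{R}^n$ defined by

(32) $\mathcal{T} = \left\{\mathbf{t} = \sum_{j=1}^p \lambda_j\mathbf{v}_j,\ \|\mathbf{t}\| = 1,\ \lambda_j \ge 0,\ \forall j = 1,\dots,p\right\}.$

Let $\mathcal{T}_n$ be a finite subset of $\mathcal{T}$ such that there exists some linear space $V_n$
with dimension $d_n < n$ containing the linear span of $\mathcal{T}_n$. Let $\{q_{\mathbf{t}}(\alpha),\ \mathbf{t} \in \mathcal{T}_n\}$
be a sequence of numbers satisfying

(33) $P\left(\sup_{\mathbf{t} \in \mathcal{T}_n}\left(\sqrt{n-d_n}\,\frac{\langle \boldsymbol{\varepsilon}, \mathbf{t}\rangle}{\|\boldsymbol{\varepsilon} - \Pi_{V_n}\boldsymbol{\varepsilon}\|} - q_{\mathbf{t}}(\alpha)\right) > 0\right) = \alpha.$

We reject the null hypothesis if the statistic

(34) $T_\alpha = \sup_{\mathbf{t} \in \mathcal{T}_n}\left(\sqrt{n-d_n}\,\frac{\langle \mathbf{Y}, \mathbf{t}\rangle}{\|\mathbf{Y} - \Pi_{V_n}\mathbf{Y}\|} - q_{\mathbf{t}}(\alpha)\right)$

is positive.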

Properties of the test. For all $\beta \in ]0,1[$ and each $\mathbf{t} \in \mathcal{T}_n$ let

(35) $v_{\mathbf{t}}(\mathbf{f},\beta) = \left(q_{\mathbf{t}}(\alpha)\sqrt{\frac{\bar\chi^{-1}_{n-d_n,\,\|\mathbf{f} - \Pi_{V_n}\mathbf{f}\|^2/\sigma^2}(\beta/2)}{n-d_n}} + \bar\Phi^{-1}(\beta/2)\right)\sigma.$

The order of magnitude of $v_{\mathbf{t}}(\mathbf{f},\beta)$ is proved to be $\sqrt{\log(n)}\,\sigma$ under the
assumption that $\|\mathbf{f} - \Pi_{V_n}\mathbf{f}\|^2/n$ is smaller than $\sigma^2$, as is shown in the proof
of Proposition 1.

We have the following result.

Theorem 1. Let $T_\alpha$ be the test statistic defined by (34). We have

(36) $\sup_{\sigma > 0}\ \sup_{\mathbf{f} \in \mathcal{C}} P_{\mathbf{f},\sigma}(T_\alpha > 0) = P_{0,1}(T_\alpha > 0) = \alpha.$

Moreover, if there exists $\mathbf{t} \in \mathcal{T}_n$ such that $\langle \mathbf{f}, \mathbf{t}\rangle \ge v_{\mathbf{t}}(\mathbf{f},\beta)$, then

$P_{\mathbf{f},\sigma}(T_\alpha > 0) \ge 1-\beta.$

Comment. The values of the $q_{\mathbf{t}}(\alpha)$'s that satisfy (33) can be easily
obtained by simulations under $P_{0,1}$. This property of our procedure lies in
the fact that the least-favorable distribution under the null is $P_{0,1}$. Note that
we do not need to use bootstrap procedures to implement the test.
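
The calibration by simulation under $P_{0,1}$ can be sketched as follows. The Python code below is a minimal illustration under one particular, admissible choice in (33): all thresholds $q_{\mathbf{t}}(\alpha)$ are taken equal to a common value $q$, estimated as the empirical $1-\alpha$ quantile of the supremum of the studentized statistics. The rows of `T` play the role of the vectors of $\mathcal{T}_n$, the columns of `basis` are assumed to form a basis of $V_n$, and both function names are hypothetical.

```python
import numpy as np

def normalized_stats(y, T, basis):
    """sqrt(n - d_n) * <y, t> / ||y - Pi_V y|| for every row t of T."""
    Q, _ = np.linalg.qr(basis)               # orthonormal basis of V_n
    resid = y - Q @ (Q.T @ y)                # y minus its projection on V_n
    n, d_n = basis.shape                     # basis is assumed of full column rank
    return np.sqrt(n - d_n) * (T @ y) / np.linalg.norm(resid)

def common_threshold(T, basis, alpha, n_sim=2000, seed=0):
    """Monte Carlo estimate of a common threshold q_t(alpha) = q under P_{0,1}."""
    rng = np.random.default_rng(seed)
    n = basis.shape[0]
    sups = np.array([normalized_stats(rng.standard_normal(n), T, basis).max()
                     for _ in range(n_sim)])
    return np.quantile(sups, 1.0 - alpha)
```

The null hypothesis is then rejected when `normalized_stats(y, T, basis).max() - q` is positive. The scale-dependent thresholds $q(\ell,u_\alpha)$ of Sections 2 and 3 are another choice satisfying (33); the common threshold is used here only because it is the shortest to write down.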

4.2. How to apply these procedures to test qualitative hypotheses. In the 
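sequel, we give the choices of $\mathcal{T}_n$ and $V_n$ leading to the tests presented in
Sections 2 and 3.

For the test of positivity described in Section 2.2. We take $\mathcal{T}_n = \mathcal{T}_{n,1}$,
with $\mathcal{T}_{n,1} = \bigcup_{\ell=1}^{\ell_n} \mathcal{T}_{n,1}^{\ell}$, where for all $\ell \in \{1,\dots,\ell_n\}$

$\mathcal{T}_{n,1}^{\ell} = \left\{-\frac{\mathbf{1}_J}{\sqrt{|J|}},\ J \in \mathcal{J}^{\ell}\right\}.$

We take $V_n = V_{n,\mathrm{cste}}$. Note that $V_{n,\mathrm{cste}}$ is also the linear span of $\mathcal{T}_{n,1}$.

For the test of monotonicity described in Section 2.3. Let us define for
each $\ell \in \{2,\dots,\ell_n\}$ and $1 \le i < j \le \ell$,

(37) $\mathbf{e}_{ij}^{\ell} = N_{ij}^{\ell}\left(\frac{1}{|J_i^{\ell}|}\mathbf{1}_{J_i^{\ell}} - \frac{1}{|J_j^{\ell}|}\mathbf{1}_{J_j^{\ell}}\right).$

Note that $N_{ij}^{\ell}$ is such that $\|\mathbf{e}_{ij}^{\ell}\| = 1$. We take $\mathcal{T}_n = \mathcal{T}_{n,2}$, with $\mathcal{T}_{n,2} = \bigcup_{\ell=2}^{\ell_n} \mathcal{T}_{n,2}^{\ell}$,
where

$\mathcal{T}_{n,2}^{\ell} = \{\mathbf{e}_{ij}^{\ell},\ 1 \le i < j \le \ell\},$

and we take $V_n = V_{n,\mathrm{cste}}$. Note that $V_n$ contains $\mathcal{T}_{n,2}$.

For the test of convexity presented in Section 2.4. Let us define for each
$\ell \in \{3,\dots,\ell_n\}$, $1 \le i < j < k \le \ell$,

(38) $\mathbf{e}_{ijk}^{\ell} = N_{ijk}^{\ell}\left(\frac{1}{|J_j^{\ell}|}\sum_{l \in J_j^{\ell}} \mathbf{e}_l - \lambda_{ijk}^{\ell}\frac{1}{|J_i^{\ell}|}\sum_{l \in J_i^{\ell}} \mathbf{e}_l - (1-\lambda_{ijk}^{\ell})\frac{1}{|J_k^{\ell}|}\sum_{l \in J_k^{\ell}} \mathbf{e}_l\right).$

Note that $N_{ijk}^{\ell}$ is such that $\|\mathbf{e}_{ijk}^{\ell}\| = 1$. We take $\mathcal{T}_n = \mathcal{T}_{n,3}$, with
$\mathcal{T}_{n,3} = \bigcup_{\ell=3}^{\ell_n} \mathcal{T}_{n,3}^{\ell}$, where

$\mathcal{T}_{n,3}^{\ell} = \{\mathbf{e}_{ijk}^{\ell},\ 1 \le i < j < k \le \ell\},$

and we take $V_n = V_{n,\mathrm{cste}}$. Note that $V_n$ contains $\mathcal{T}_{n,3}$.

For the test of $F \in \mathcal{K}_{r,R}$ presented in Section 3. We take

$\mathcal{T}_{n,4} = \bigcup_{\ell=1}^{\ell_n} \mathcal{T}_{n,4}^{\ell}$ where $\mathcal{T}_{n,4}^{\ell} = \{\mathbf{t}_J^*,\ J \in \mathcal{J}^{\ell}\}$

and $V_n$ defined by (21). Note that $V_n$ contains $\mathcal{T}_{n,4}$.

We justify these choices of $\mathcal{T}_n$ by the following proposition proved in
Section 7.

Proposition 5. Let $\mathcal{C}$ and $\mathcal{T}_n$ be either $(\mathcal{C}_{\ge 0},\mathcal{T}_{n,1})$, $(\mathcal{C}_{\nearrow},\mathcal{T}_{n,2})$, $(\mathcal{C}_{\smile},\mathcal{T}_{n,3})$
or $(\mathcal{C}_{r,R},\mathcal{T}_{n,4})$. There exist $\mathbf{v}_1,\dots,\mathbf{v}_p$ for which $\mathcal{C}$ is of the form given by (27)
and for which $\mathcal{T}$ defined by (32) contains $\mathcal{T}_n$.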

5. Simulation studies. In this section we describe how to implement the 
test for $F \in \mathcal{K}_{\nearrow}$ and we carry out a simulation study in order to evaluate
the performances of our tests both when the errors are Gaussian and when 
they are not. We first describe how the testing procedure is performed, then 
we present the simulation experiment and finally discuss the results of the 
simulation study. 

5.1. The testing procedures. We carry out the simulation study for the 
two testing procedures described in Sections 2.3 and 3.1. In the sequel, the 
procedure based on differences of local means and described in Section 2.3 
is called LM and the procedure based on local gradients defined below (from 
the test statistic given in Section 3.1 with r = 1) is called LG. 

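In the case of the procedure LM, we set $T_{LM} = T_{\alpha,2}$ defined in (11). For
each $\ell$, the quantiles $q_2(\ell,u_\alpha)$ are calculated as follows. For $u$ varying among
a suitable grid of values $u_1,\dots,u_m$, we estimate by simulations the quantity

$p(u) = P\left(\max_{\ell \in \{2,\dots,\ell_n\}} \{T_2^{\ell}(\boldsymbol{\varepsilon}) - q_2(\ell,u)\} > 0\right),$

$\boldsymbol{\varepsilon}$ being an $n$-sample of $\mathcal{N}(0,1)$, and we take $u_\alpha$ as $\max\{u_j,\ p(u_j) \le \alpha\}$. Note
that $u_\alpha$ does not depend on $(x_i,\ i = 1,\dots,n)$, but only on the number of
observations $n$.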
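
The grid search for $u_\alpha$ can be sketched in Python as follows. This is an illustration under assumptions: the array `sim` is supposed to contain, in row $b$ and column $\ell$, the scale-$\ell$ statistic evaluated on the $b$th simulated Gaussian sample, the quantiles $q(\ell,u)$ are replaced by their empirical counterparts from the same simulations, and the name `calibrate_u_alpha` is hypothetical.

```python
import numpy as np

def calibrate_u_alpha(sim, alpha, grid):
    """Return max{u_j in grid : p(u_j) <= alpha}, or None if no u_j qualifies.

    sim[b, l]: statistic of scale l computed on the b-th simulated noise vector.
    """
    best = None
    for u in np.sort(np.asarray(grid, dtype=float)):
        q = np.quantile(sim, 1.0 - u, axis=0)           # empirical q(l, u), one per scale
        p_u = np.mean((sim - q).max(axis=1) > 0)        # Monte Carlo estimate of p(u)
        if p_u <= alpha:
            best = u                                    # keep the largest admissible u
    return best
```

Once `u = calibrate_u_alpha(sim, alpha, grid)` is available, the thresholds used by the test are `np.quantile(sim, 1 - u, axis=0)`. Estimating the quantiles and $p(u)$ from the same simulated sample is a simplification made here for brevity.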

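In the case of the procedure LG, the test statistic is defined as follows.
For each $\ell = 1,\dots,\ell_n$ and for $J \in \mathcal{J}^{\ell}$, we take

$\mathbf{t}_J^* = \frac{\bar x_J\mathbf{1}_J - \mathbf{x}_J}{\|\bar x_J\mathbf{1}_J - \mathbf{x}_J\|}.$

The space $V_n$ reduces to $V_{n,\mathrm{lin}}$, the linear space of dimension $2\ell_n$ generated
by

$\{\mathbf{1}_J, \mathbf{x}_J,\ J \in \mathcal{J}^{\ell_n}\}.$

The test statistic $T_\alpha$ takes the form

$T_{LG} = T_{\alpha,4} = \max_{\ell = 1,\dots,\ell_n} \{T_4^{\ell}(\mathbf{Y}) - q_4(\ell,u_\alpha)\},$

where for each $\ell \in \{1,\dots,\ell_n\}$

$T_4^{\ell}(\mathbf{Y}) = \max_{J \in \mathcal{J}^{\ell}} \sqrt{n-2\ell_n}\ \frac{\langle \mathbf{Y}, \mathbf{t}_J^*\rangle}{\|\mathbf{Y} - \Pi_{V_{n,\mathrm{lin}}}\mathbf{Y}\|}$

and $q_4(\ell,u_\alpha)$ denotes the $1-u_\alpha$ quantile of the random variable $T_4^{\ell}(\boldsymbol{\varepsilon})$.

The procedure for calculating $q_4(\ell,u_\alpha)$ for $\ell = 2,\dots,\ell_n$ is the same as the
procedure for calculating the $q_2(\ell,u_\alpha)$'s.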



5.2. The simulation experiment. The number of observations $n$ equals
100, $x_i = i/(n+1)$ for $i = 1,\dots,n$, and $\ell_n$ is either equal to 15 or 25.

We consider three distributions of the errors $\varepsilon_i$, with expectation zero and
variance 1.

1. The Gaussian distribution: $\varepsilon_i \sim \mathcal{N}(0,1)$.

2. The type I distribution: $\varepsilon_i$ has density $s f_X(\mu + s x)$, where $f_X(x) = \exp\{-x - \exp(-x)\}$ and where $\mu$ and $s^2$ are the expectation and the
variance of a variable $X$ with density $f_X$. This distribution is asymmetrical.

3. The mixture of Gaussian distributions: $\varepsilon_i$ is distributed as $\pi X_1 + (1-\pi)X_2$, where $\pi$ is distributed as a Bernoulli variable with expectation 0.9,
$X_1$ and $X_2$ are centered Gaussian variables with variances, respectively,
equal to $2.43s$ and $25s$, and $\pi$, $X_1$ and $X_2$ are independent. The quantity
$s$ is chosen such that the variance of $\varepsilon_i$ equals 1. This distribution has
heavy tails.
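
As a hedged illustration of the two non-Gaussian error distributions, the sketch below draws standardized samples using only the Python standard library. The function names are hypothetical. The type I draw uses inverse-CDF sampling and the known mean (Euler's constant) and variance ($\pi^2/6$) of the density $\exp\{-x-\exp(-x)\}$; the mixture draw solves $0.9 \times 2.43\,s + 0.1 \times 25\,s = 1$ for $s$.

```python
import math
import random

EULER_GAMMA = 0.5772156649015329  # mean of the density exp(-x - exp(-x))


def type_one_error(rng):
    """Standardized draw (mean 0, variance 1) from the density exp(-x - exp(-x))."""
    u = rng.random()
    while u == 0.0:  # avoid log(0)
        u = rng.random()
    x = -math.log(-math.log(u))   # inverse-CDF sampling, CDF = exp(-exp(-x))
    s = math.pi / math.sqrt(6.0)  # standard deviation of X
    return (x - EULER_GAMMA) / s


def mixture_error(rng):
    """Draw pi*X1 + (1 - pi)*X2 with pi ~ Bernoulli(0.9), scaled to variance 1."""
    s = 1.0 / (0.9 * 2.43 + 0.1 * 25.0)  # makes the overall variance equal to 1
    # Only the selected component is drawn; this has the same distribution
    # as drawing pi, X1 and X2 independently and combining them.
    if rng.random() < 0.9:
        return rng.gauss(0.0, math.sqrt(2.43 * s))
    return rng.gauss(0.0, math.sqrt(25.0 * s))


rng = random.Random(1)
sample = [mixture_error(rng) for _ in range(1000)]
```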

We consider several functions $F$ that are presented below. For each of
them, we simulate the observations $Y_i = F(x_i) + \sigma\varepsilon_i$. The values of $\sigma^2$ and
of the distance in sup-norm between $F$ and $\mathcal{K}_{\nearrow}$ are reported in Table 1:

$$d_\infty(F,\mathcal{K}_{\nearrow}) = \frac{1}{2}\sup_{0 \le s \le t \le 1}\bigl(F(s) - F(t)\bigr).$$

Let us comment on the choice of the considered functions.

(a) $F_0(x) = 0$ corresponds to the case for which the quantiles $q(\ell,u_\alpha)$ are
calculated.

(b) The function $F_1(x) = 15\,\mathbf{1}_{x \le 0.5}(x-0.5)^3 + 0.3(x-0.5) - \exp(-250(x-0.25)^2)$ presents a strongly increasing part with a pronounced dip around
$x = 1/4$ followed by a nearly flat part on the interval $[1/2, 1]$.

(c) The decreasing linear function $F_2(x) = -ax$, the parameter $a$ being
chosen such that $a = 1.5\sigma$.

(d) The function $F_3(x) = -0.2\exp(-50(x-0.5)^2)$ deviates from $F_0$ by a
smooth dip while the function $F_4(x) = 0.1\cos(6\pi x)$ deviates from $F_0$ by a
cosine function.

(e) The functions $F_5(x) = 0.2x + F_3(x)$ and $F_6(x) = 0.2x + F_4(x)$ deviate
from an increasing linear function in the same way as $F_3$ and $F_4$ do from
$F_0$.

Let us mention that it is more difficult to detect that $F_5$ (resp. $F_6$) is not
increasing than to detect that $F_3$ (resp. $F_4$) is not. Indeed, adding an increasing
function to a function $F$ reduces the distance in sup-norm between $F$ and
$\mathcal{K}_{\nearrow}$. This is the reason why the values of $\sigma$ are smaller in the simulation
study when we consider the functions $F_5$ and $F_6$.
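
The distance $d_\infty(F,\mathcal{K}_{\nearrow})$ can be approximated numerically by tracking the running maximum of $F$ on a grid. The following sketch is an illustration only; the function name `dist_to_nondecreasing` and the grid size are assumptions, and the value $\sigma = 0.1$ corresponds to $\sigma^2 = 0.01$ as reported in Table 1 for $F_1,\dots,F_4$.

```python
import math


def dist_to_nondecreasing(F, grid_size=10001):
    """Approximate (1/2) * sup_{0 <= s <= t <= 1} (F(s) - F(t)) on a regular grid."""
    running_max = -math.inf
    largest_drop = 0.0
    for k in range(grid_size):
        value = F(k / (grid_size - 1))
        running_max = max(running_max, value)                  # sup of F over [0, t]
        largest_drop = max(largest_drop, running_max - value)  # sup_s F(s) - F(t)
    return 0.5 * largest_drop


sigma = 0.1  # sigma^2 = 0.01


def F1(x):
    cubic = 15.0 * (x - 0.5) ** 3 if x <= 0.5 else 0.0
    return cubic + 0.3 * (x - 0.5) - math.exp(-250.0 * (x - 0.25) ** 2)


def F2(x):
    return -1.5 * sigma * x


def F3(x):
    return -0.2 * math.exp(-50.0 * (x - 0.5) ** 2)


def F4(x):
    return 0.1 * math.cos(6.0 * math.pi * x)


distances = {name: dist_to_nondecreasing(F)
             for name, F in [("F1", F1), ("F2", F2), ("F3", F3), ("F4", F4)]}
```

On the continuous interval $[0,1]$ this gives values close to those of Table 1; small differences (for instance for $F_2$) are to be expected if the table was computed on the design points only.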

In Figure 1 we have displayed the functions $F_\ell$ for $\ell = 1,\dots,6$ and for
each of them one sample simulated with Gaussian errors. The corresponding
values of the test statistics $T_{LM}$ and $T_{LG}$ for $\alpha = 5\%$ and $\ell_n = 25$ are given.
For this simulated sample, it appears that the test based on the statistic $T_{LM}$
leads to rejection of the null hypothesis in all cases, while the test based on
$T_{LG}$ rejects in all cases except for functions $F_2$ and $F_4$.

The results of the simulation experiment based on 4000 simulations are 
presented in Tables 2 and 3. 

5.3. Comments on the simulation study. As expected, the estimated
level of the test calculated for the function $F_0(x) = 0$ is (nearly) equal to $\alpha$
when the errors are distributed as Gaussian variables.

When $\ell_n = 25$, the estimated levels of the tests for the mixture and type I
distributions are greater than $\alpha$ (see Table 2). Let us recall that when $\ell_n$ is
large, we are considering statistics based on the average of the observations
on sets $J$ with small cardinality. Therefore, reducing $\ell_n$ improves the robustness
to a non-Gaussian error distribution. This is what we get in Table 2
for $\ell_n = 15$. It also appears that the method based on the local means is
more robust than the method based on the local gradients, and that both
methods are more robust for the type I distribution, which is asymmetric but
not heavy tailed, than for the mixture distribution.

Table 1
Testing monotonicity: simulated functions $F$, values of $\sigma^2$ and distance in sup-norm between $F$ and $\mathcal{K}_{\nearrow}$

$F$          $\sigma^2$    $d_\infty(F,\mathcal{K}_{\nearrow})$
$F_0(x)$     0.01          0
$F_1(x)$     0.01          0.25
$F_2(x)$     0.01          0.073
$F_3(x)$     0.01          0.1
$F_4(x)$     0.01          0.1
$F_5(x)$     0.004         0.06
$F_6(x)$     0.006         0.08

Fig. 1. For each function $F_\ell$, $\ell = 1,\dots,6$, the simulated data $Y_i = F_\ell(x_i) + \sigma\varepsilon_i$ for
$i = 1,\dots,n$ are displayed. The errors $\varepsilon_i$ are Gaussian normalized centered variables. The
values of the test statistics $T_{LM}$ and $T_{LG}$, with $\alpha = 5\%$, are given for each example.

Except for the function $F_1$, the estimated power is greater for the procedure
based on the local means than for the procedure based on the local
gradients (see Table 3). For both procedures the power of the test is larger
with $\ell_n = 25$ than with $\ell_n = 15$. However, except for the function $F_1$, the
loss of power is less significant for the procedure based on the local means.

5.4. Comparison with other work. As expected, the power of our procedure
$T_{LG}$ for the function $F_1$ is similar to that obtained by Hall and Heckman
(2000).

The decreasing linear function $F_2(x) = -ax$ has already been studied by
Gijbels, Hall, Jones and Koch (2000) with $a = 3\sigma$. They get an estimated
power of 77%.

Gijbels, Hall, Jones and Koch (2000) studied the function $0.075F_3/0.2$
with $\sigma = 0.025$ and obtained a simulated power of 98%. With the same
function and the same $\sigma$, we get a power equal to 1, for both procedures
and for $\ell_n = 15$ and $\ell_n = 25$.

Table 2
Testing monotonicity: levels of the tests based on $T_{LM}$ and $T_{LG}$

                       $\ell_n = 15$            $\ell_n = 25$
Errors distribution    $T_{LM}$    $T_{LG}$     $T_{LM}$    $T_{LG}$
Gaussian               0.049       0.050        0.046       0.051
Type I                 0.048       0.072        0.064       0.085
Mixture                0.064       0.117        0.093       0.180

Table 3
Testing monotonicity: powers of the tests based on $T_{LM}$ and $T_{LG}$ when the errors are Gaussian

           $\ell_n = 15$            $\ell_n = 25$
$F$        $T_{LM}$    $T_{LG}$     $T_{LM}$    $T_{LG}$
$F_1$      0.85        0.99         0.99        1
$F_2$      0.96        0.96         0.99        0.99
$F_3$      0.99        0.73         1           0.98
$F_4$      0.89        0.71         0.99        0.94
$F_5$      0.99        0.69         0.99        0.87
$F_6$      0.87        0.79         0.98        0.93

Gijbels, Hall, Jones and Koch (2000) and Hall and Heckman (2000) calculated
the power of their test for the function $F_7(x) = 1 + x - a\exp(-50(x-0.5)^2)$
for different values of $a$ and $\sigma$. When $a = 0.45$ and $\sigma = 0.05$, we get a
power equal to 1 as Gijbels, Hall, Jones and Koch (2000) do. When $a = 0.45$
and $\sigma = 0.1$, we get a power equal to 76% when using the procedure $T_{LM}$
with $\ell_n = 25$ or $\ell_n = 15$. Gijbels, Hall, Jones and Koch (2000) got 80% and
Hall and Heckman (2000) a power larger than 87%.



6. Proof of Theorem 1. 



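Level of the test. We first prove that for all $t \in \mathcal{T}_n$, $q_t(\alpha) > 0$. Indeed,
thanks to (33), we have

$$\mathbb{P}\left[\frac{\sqrt{n-d_n}\,\langle\varepsilon,t\rangle}{\|\varepsilon-\Pi_{V_n}\varepsilon\|} - q_t(\alpha) > 0\right] \le \mathbb{P}\left[\sup_{t\in\mathcal{T}_n}\left\{\frac{\sqrt{n-d_n}\,\langle\varepsilon,t\rangle}{\|\varepsilon-\Pi_{V_n}\varepsilon\|} - q_t(\alpha)\right\} > 0\right] \le \alpha < \frac{1}{2}.$$

Since the random variable $\sqrt{n-d_n}\,\langle\varepsilon,t\rangle/\|\varepsilon-\Pi_{V_n}\varepsilon\|$ is symmetric (distributed
as Student with $n-d_n$ degrees of freedom), we deduce that $q_t(\alpha)$
is positive. In the sequel let us set

$$\hat\sigma_n = \|Y-\Pi_{V_n}Y\|/\sqrt{n-d_n}.$$

Since for all $f \in \mathcal{C}$ and $j \in \{1,\dots,p\}$, $\langle f,v_j\rangle \le 0$, we have that for all $t \in \mathcal{T}_n$

$$\langle f,t\rangle = \frac{\sum_{j=1}^{p}\lambda_j\langle f,v_j\rangle}{\|\sum_{j=1}^{p}\lambda_j v_j\|} \le 0,$$

the $\lambda_j$'s being nonnegative. Hence, $\langle Y,t\rangle = \langle f,t\rangle + \sigma\langle\varepsilon,t\rangle \le \sigma\langle\varepsilon,t\rangle$ and therefore for all $f \in \mathcal{C}$ and $\sigma > 0$,

$$\mathbb{P}_{f,\sigma}[T_\alpha > 0] \le \mathbb{P}_{f,\sigma}\left[\sup_{t\in\mathcal{T}_n}\left\{\frac{\sigma\langle\varepsilon,t\rangle}{\hat\sigma_n} - q_t(\alpha)\right\} > 0\right] \le \mathbb{P}_{f,\sigma}\left[\frac{\hat\sigma_n}{\sigma} < \sup_{t\in\mathcal{T}_n}\frac{\langle\varepsilon,t\rangle}{q_t(\alpha)}\right].$$

We now use the following lemma for noncentral $\chi^2$-random variables.

Lemma 1. For all $u > 0$, $f \in \mathbb{R}^n$ and $\sigma > 0$,

$$\mathbb{P}_{f,\sigma}[\hat\sigma_n \le \sigma u] \le \mathbb{P}_{0,1}[\hat\sigma_n \le u].$$

This lemma states that a noncentral $\chi^2$-random variable is stochastically
larger than a $\chi^2$-random variable with the same degrees of freedom. For a
proof we refer to Lemma 1 in Baraud, Huet and Laurent (2003a).

Since $\mathcal{T}_n \subset V_n$, the random variables $\langle\varepsilon,t\rangle$ for $t \in \mathcal{T}_n$ are independent of
$\hat\sigma_n$ and thus by conditioning with respect to the $\langle\varepsilon,t\rangle$'s and using Lemma 1
we get

$$\sup_{\sigma>0}\sup_{f\in\mathcal{C}}\mathbb{P}_{f,\sigma}[T_\alpha > 0] \le \mathbb{P}_{0,1}\left[\hat\sigma_n < \sup_{t\in\mathcal{T}_n}\frac{\langle\varepsilon,t\rangle}{q_t(\alpha)}\right] = \mathbb{P}_{0,1}[T_\alpha > 0] = \alpha.$$

The reverse inequality being obvious, this concludes the proof of (36).

Power of the test. For any $f \in \mathbb{R}^n$ and $\sigma > 0$,

$$\mathbb{P}_{f,\sigma}(T_\alpha \le 0) = \mathbb{P}_{f,\sigma}\bigl(\forall t \in \mathcal{T}_n,\ \langle Y,t\rangle \le q_t(\alpha)\hat\sigma_n\bigr).$$

Setting

$$x_n(f,\beta) = \sigma\sqrt{\frac{\bar\chi^{-1}_{n-d_n,\|f-\Pi_{V_n}f\|^2/\sigma^2}(\beta/2)}{n-d_n}},$$

we have

$$\mathbb{P}_{f,\sigma}\bigl(\hat\sigma_n > x_n(f,\beta)\bigr) = \beta/2.$$

It follows that for all $f \in \mathbb{R}^n$ and $\sigma > 0$,

$$\mathbb{P}_{f,\sigma}(T_\alpha \le 0) \le \inf_{t\in\mathcal{T}_n}\mathbb{P}_{f,\sigma}\bigl(\langle Y,t\rangle \le q_t(\alpha)x_n(f,\beta)\bigr) + \beta/2 \le \inf_{t\in\mathcal{T}_n}\mathbb{P}_{f,\sigma}\bigl(\sigma\langle\varepsilon,t\rangle \le q_t(\alpha)x_n(f,\beta) - \langle f,t\rangle\bigr) + \beta/2.$$

Since $\|t\| = 1$, $\langle\varepsilon,t\rangle$ is distributed as a standard Gaussian variable, and
therefore $\mathbb{P}_{f,\sigma}(T_\alpha \le 0) \le \beta$ as soon as there exists $t \in \mathcal{T}_n$ such that

$$q_t(\alpha)x_n(f,\beta) - \langle f,t\rangle \le -\sigma\bar\Phi^{-1}(\beta/2).$$

This concludes the proof of Theorem 1.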
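
The stochastic ordering stated in Lemma 1 can be checked empirically. The sketch below is a Monte Carlo illustration only, not part of the proof; the function name and the chosen parameters (5 degrees of freedom, a shift of norm 3) are arbitrary assumptions. It compares the empirical distribution functions of $\|\varepsilon\|^2$ and $\|f+\varepsilon\|^2$ at one point.

```python
import random


def central_and_noncentral_cdf(dof, shift, x, n_sim, rng):
    """Monte Carlo estimates of P(chi2 <= x) and P(noncentral chi2 <= x)."""
    central = 0
    noncentral = 0
    for _ in range(n_sim):
        eps = [rng.gauss(0.0, 1.0) for _ in range(dof)]
        if sum(e * e for e in eps) <= x:
            central += 1
        # The whole noncentrality is put on the first coordinate: ||f||^2 = shift^2.
        if (eps[0] + shift) ** 2 + sum(e * e for e in eps[1:]) <= x:
            noncentral += 1
    return central / n_sim, noncentral / n_sim


rng = random.Random(2)
cdf_central, cdf_noncentral = central_and_noncentral_cdf(5, 3.0, 5.0, 20000, rng)
```

The noncentral variable puts less mass below any threshold, which is the direction used when the supremum over $f$ and $\sigma$ is bounded by the probability under $\mathbb{P}_{0,1}$.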

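7. Proofs of Propositions 4 and 5. Let us denote by $\mathcal{I}_r$ the set of increasing
sequences of $r+1$ indices in $\{1,\dots,n\}$, that is,

(39) $\mathcal{I}_r = \{(i_1,\dots,i_{r+1}),\ i_1 < \cdots < i_{r+1},\ i_j \in \{1,\dots,n\}\}$.

For $\mathbf{i} = (i_1,\dots,i_{r+1}) \in \mathcal{I}_r$ and $v \in \mathbb{R}^n$ we set

(40) $\phi_{\mathbf{i}}(v) = \det\begin{pmatrix} 1 & x_{i_1} & \cdots & x_{i_1}^{r-1} & v_{i_1} \\ 1 & x_{i_2} & \cdots & x_{i_2}^{r-1} & v_{i_2} \\ \vdots & \vdots & & \vdots & \vdots \\ 1 & x_{i_{r+1}} & \cdots & x_{i_{r+1}}^{r-1} & v_{i_{r+1}} \end{pmatrix}$.

For $\mathbf{i} = (i,\dots,i+r)$, $\phi_{\mathbf{i}}(v) = \phi_{i,r}(v)$, where $\phi_{i,r}(v)$ is defined by (30). For
$w^1,\dots,w^q$, $q$ vectors of $\mathbb{R}^n$, we set

$$\mathrm{Gram}(w^1,\dots,w^q) = \det(G) \quad\text{where } G = (\langle w^i,w^j\rangle)_{1\le i,j\le q}.$$

Let us define

(41) $\tilde{\mathcal{C}}_{r,R} = \{f \in \mathbb{R}^n,\ \forall \mathbf{i} \in \mathcal{I}_r,\ \phi_{\mathbf{i}}(R\star f) \ge 0\}$.

The proofs of Propositions 4 and 5 rely on the following lemma.

Lemma 2. The following equalities hold. First,

(42) $\mathcal{C}_{r,R} = \tilde{\mathcal{C}}_{r,R}$.

Assume that $f = (F(x_1),\dots,F(x_n))'$, where $F$ is such that $RF$ is $r$ times
differentiable. Then for each $\mathbf{i} \in \mathcal{I}_r$ there exists some $c_{\mathbf{i}} \in\, ]x_{i_1},x_{i_{r+1}}[$ such
that

(43) $\phi_{\mathbf{i}}(R\star f) = \dfrac{\Lambda(F)(c_{\mathbf{i}})}{r!}\,\phi_{\mathbf{i}}(x^r)$.

For $J \subset \{1,\dots,n\}$ let $t_J^{\star}$ be defined by (22). We have

(44) $-\langle f,t_J^{\star}\rangle = N_J^{-1}\sum_{\mathbf{i}\in\mathcal{I}_r\cap J^{r+1}}\phi_{\mathbf{i}}(R\star f)\,\phi_{\mathbf{i}}(x^r)$,

where $N_J = \mathrm{Gram}(1_J,x_J,\dots,x_J^{r-1})\,\gamma_J$.

The proof of the lemma is delayed to the Appendix.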

7.1. Proof of Proposition 4. The result concerning $\mathcal{K}_{\smile}$ is clear as a function
$F$ is nonconcave on $[0,1]$ if and only if for all $x,y,z$ in $[0,1]$ with
$x < y < z$ one has

$$\det\begin{pmatrix} 1 & x & F(x) \\ 1 & y & F(y) \\ 1 & z & F(z) \end{pmatrix} \ge 0.$$
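
The determinant criterion above is easy to evaluate directly. The following sketch, with the hypothetical helper name `convexity_determinant`, expands the $3\times3$ determinant along its first column; it is given as an illustration of the sign convention only.

```python
def convexity_determinant(F, x, y, z):
    """det [[1, x, F(x)], [1, y, F(y)], [1, z, F(z)]], expanded along the first column."""
    fx, fy, fz = F(x), F(y), F(z)
    return (y * fz - z * fy) - (x * fz - z * fx) + (x * fy - y * fx)


convex_value = convexity_determinant(lambda u: u * u, 0.1, 0.5, 0.9)
concave_value = convexity_determinant(lambda u: -u * u, 0.1, 0.5, 0.9)
affine_value = convexity_determinant(lambda u: 2.0 * u + 1.0, 0.1, 0.5, 0.9)
```

For $F(u) = u^2$ the determinant is the Vandermonde determinant $(y-x)(z-x)(z-y)$, which is positive for $x<y<z$; it changes sign for $-u^2$ and vanishes for affine functions.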

Let us now turn to the set $\mathcal{K}_{r,R}$. First note that the $n-r$ linear forms
$f \mapsto \phi_{i,r}(R\star f)$ are independent since the linear space

$$\{f \in \mathbb{R}^n,\ \forall i \in \{1,\dots,n-r\},\ \phi_{i,r}(R\star f) = 0\},$$

which is generated by the vectors

$$\bigl(x_1^k/R(x_1),\dots,x_n^k/R(x_n)\bigr)', \quad k = 0,\dots,r-1,$$

is of dimension $r$. Second, the fact that $f$ belongs to $\mathcal{C}_{r,R}$ is a straightforward
consequence of (43) since under the assumption that $F \in \mathcal{K}_{r,R}$, $\Lambda(F)(x) \ge 0$
for all $x$, and since the Vandermonde determinants $\phi_{\mathbf{i}}(x^r)$ are positive for
all $\mathbf{i} \in \mathcal{I}_r$.
7.2. Proof of Proposition 5. The result is clear in the case where $\mathcal{C} = \mathcal{C}_{\ge 0}$.
For the other cases we use the following lemma.

Lemma 3. Let $W$ be the orthogonal complement of the linear space generated
by the $v_j$'s for $j = 1,\dots,p$. If $t^{\star} \notin W$ satisfies for all $f \in \mathcal{C}$

$$\langle t^{\star}-\Pi_W t^{\star}, f\rangle \le 0,$$

then

$$\frac{t^{\star}-\Pi_W t^{\star}}{\|t^{\star}-\Pi_W t^{\star}\|} \in \mathcal{T}.$$

Proof. The vector $t^{\star}-\Pi_W t^{\star}$ belongs to the linear space generated by
the $v_j$'s and thus one can write $g^{\star} = t^{\star}-\Pi_W t^{\star} = \sum_{j=1}^{p}\lambda_j v_j$. It remains to
show that the $\lambda_j$'s are nonnegative. Let us fix $j_0 \in \{1,\dots,p\}$ and choose $f^{j_0}$
in $\mathbb{R}^n$ satisfying $\langle f^{j_0},v_j\rangle = 0$ for all $j \ne j_0$ and $\langle f^{j_0},v_{j_0}\rangle < 0$. Such a vector
exists since the $v_j$'s are linearly independent in $\mathbb{R}^n$. Clearly $f^{j_0}$ belongs
to $\mathcal{C}$ and therefore $\langle f^{j_0},g^{\star}\rangle = \lambda_{j_0}\langle f^{j_0},v_{j_0}\rangle \le 0$, which constrains $\lambda_{j_0}$ to be
nonnegative. This concludes the proof of Lemma 3. □

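Let us consider the case where $\mathcal{C} = \mathcal{C}_{\nearrow}$. We apply Lemma 3. In this case
$W$ is the linear space generated by $1$; we get that for all $\ell \in \{2,\dots,\ell_n\}$ and
$1 \le i < j \le \ell$, $e_{ij}^{\ell}$ satisfies $\Pi_W e_{ij}^{\ell} = 0$. Moreover $\|e_{ij}^{\ell}\| = 1$ and

$$\forall f \in \mathcal{C}_{\nearrow}, \quad \langle f,e_{ij}^{\ell}\rangle = N_{ij}^{\ell}\bigl(\bar f_{J_i^{\ell}} - \bar f_{J_j^{\ell}}\bigr) \le 0.$$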

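Let us consider the case where $\mathcal{C} = \mathcal{C}_{\smile}$. In this case, $p = n-2$ and for all
$j = 1,\dots,n-2$,

$$v_j = (x_{j+1}-x_{j+2})e_j + (x_{j+2}-x_j)e_{j+1} + (x_j-x_{j+1})e_{j+2}.$$

Since $\|e_{ijk}^{\ell}\| = 1$, by Lemma 3 it is enough to prove that:

(i) for all $f \in W$, $\langle f,e_{ijk}^{\ell}\rangle = 0$,

(ii) for all $f \in \mathcal{C}_{\smile}$, $\langle f,e_{ijk}^{\ell}\rangle \le 0$.

First note that for all $f \in \mathbb{R}^n$,

(45) $\langle f,e_{ijk}^{\ell}\rangle = N_{ijk}^{\ell}\bigl(\bar f_{J_j^{\ell}} - \lambda_{ijk}^{\ell}\bar f_{J_i^{\ell}} - (1-\lambda_{ijk}^{\ell})\bar f_{J_k^{\ell}}\bigr)$.

Clearly if $f = 1$ or $f = x$, $\langle f,e_{ijk}^{\ell}\rangle = 0$ and since by definition of $\mathcal{C}_{\smile}$, $W$ is
the linear space generated by $1$ and $x$, (i) holds true. Let now $f \in \mathcal{C}_{\smile}$. There
exists some convex function $F$ mapping $[x_1,x_n]$ into $\mathbb{R}$ such that $F(x_i) = f_i$
for all $i = 1,\dots,n$ (take the piecewise linear function verifying this property,
e.g.). Let $i < j < k$ and $l \in J_j^{\ell}$. We set

$$\mu_{ik}^{l} = \frac{\bar x_{J_k^{\ell}} - x_l}{\bar x_{J_k^{\ell}} - \bar x_{J_i^{\ell}}}.$$

Note that $0 \le \mu_{ik}^{l} \le 1$ and that

$$x_l = \mu_{ik}^{l}\,\bar x_{J_i^{\ell}} + (1-\mu_{ik}^{l})\,\bar x_{J_k^{\ell}}.$$

Since $F$ is convex on $[x_1,x_n]$, we have for all $l \in J_j^{\ell}$,

$$F(x_l) \le \mu_{ik}^{l}F(\bar x_{J_i^{\ell}}) + (1-\mu_{ik}^{l})F(\bar x_{J_k^{\ell}}) \le \mu_{ik}^{l}\bar f_{J_i^{\ell}} + (1-\mu_{ik}^{l})\bar f_{J_k^{\ell}}.$$

Note that $\sum_{l\in J_j^{\ell}}\mu_{ik}^{l}/|J_j^{\ell}| = \lambda_{ijk}^{\ell}$. We derive from the above inequality that

$$\bar f_{J_j^{\ell}} = \frac{1}{|J_j^{\ell}|}\sum_{l\in J_j^{\ell}} f_l \le \lambda_{ijk}^{\ell}\bar f_{J_i^{\ell}} + (1-\lambda_{ijk}^{\ell})\bar f_{J_k^{\ell}},$$

which, thanks to (45), leads to (ii).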

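Let us consider the case where $\mathcal{C} = \mathcal{C}_{r,R}$. By Lemma 2 we know that
$\mathcal{C}_{r,R} = \tilde{\mathcal{C}}_{r,R}$ and therefore for each $\mathbf{i} \in \mathcal{I}_r$, the linear form $f \mapsto \phi_{\mathbf{i}}(R\star f)$ is a linear
combination of the linear forms $f \mapsto \phi_{i,r}(R\star f)$ with $i = 1,\dots,n-r$. Consequently,
if $w \in W$, then for all $\mathbf{i} \in \mathcal{I}_r$, $\phi_{\mathbf{i}}(R\star w) = 0$. For each $J \subset \{1,\dots,n\}$,
$t_J^{\star}$ defined by (22) satisfies $\|t_J^{\star}\| = 1$. By applying (44) with $f = w$, we get
$\langle w,t_J^{\star}\rangle \le 0$ for all $w \in \mathcal{C}_{r,R}$ and $\langle w,t_J^{\star}\rangle = 0$ for all $w \in W$. Consequently, by
Lemma 3, $t_J^{\star}$ belongs to $\mathcal{T}$.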

8. Proof of Proposition 1. 

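8.1. Proof for $(T_\alpha,\mathcal{C}) = (T_{\alpha,1},\mathcal{C}_{\ge 0})$. We prove the proposition by applying
Theorem 1. We decompose the proof into six steps.

Step 1. For all integer $N \ge 1$, let $\bar T_N^{-1}(u)$ denote the $1-u$ quantile
of a Student random variable with $N$ degrees of freedom. We have for all
$u \in\, ]0,1[$,

(46) $\bar T_N^{-1}(u) \le 1 + C\left\{\log^{1/4}\left(\frac{1}{u}\right) + \log^{1/2}\left(\frac{1}{u}\right)\exp\left(\frac{2}{N}\log\left(\frac{1}{u}\right)\right)\right\}$

for some absolute constant $C > 0$.

Proof. Let $\bar F_{1,N}^{-1}(u)$ denote the $1-u$ quantile of a Fisher variable with
one and $N$ degrees of freedom. Then

$$\bar T_N^{-1}(u) \le \sqrt{\bar F_{1,N}^{-1}(u)}.$$

It follows from Lemma 1 in Baraud, Huet and Laurent (2003a) that for all
$u \in\, ]0,1[$, $N \ge 1$,

$$\bar F_{1,N}^{-1}(u) \le 1 + 2\sqrt{2}\log^{1/2}\left(\frac{1}{u}\right) + \frac{3N}{2}\left\{\exp\left(\frac{4}{N}\log\left(\frac{1}{u}\right)\right) - 1\right\}.$$

Using the inequality $\exp(x) - 1 \le x\exp(x)$ which holds for all $x \ge 0$, we
obtain

$$\bar F_{1,N}^{-1}(u) \le 1 + 2\sqrt{2}\log^{1/2}\left(\frac{1}{u}\right) + 6\log\left(\frac{1}{u}\right)\exp\left(\frac{4}{N}\log\left(\frac{1}{u}\right)\right),$$

and since $\sqrt{a+b} \le \sqrt{a} + \sqrt{b}$ for all $a \ge 0$ and $b \ge 0$,

$$\bar T_N^{-1}(u) \le 1 + C\left\{\log^{1/4}\left(\frac{1}{u}\right) + \log^{1/2}\left(\frac{1}{u}\right)\exp\left(\frac{2}{N}\log\left(\frac{1}{u}\right)\right)\right\}$$

for some absolute constant $C > 0$. □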
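Step 2. For all $\ell \in \{1,\dots,\ell_n\}$, $t \in \mathcal{T}_{n,1}^{\ell}$,

(47) $q_t(\alpha) = q_1(\ell,u_\alpha) \le C(\alpha)\sqrt{\log(n)}$.

Proof. On the one hand, by definition of $q_1(\ell,\cdot)$,

$$\alpha = \mathbb{P}_{0,1}(T_{\alpha,1} > 0) \le \sum_{\ell=1}^{\ell_n}\mathbb{P}\bigl(T_1^{\ell}(\varepsilon) - q_1(\ell,u_\alpha) > 0\bigr) \le \ell_n u_\alpha$$

and thus

(48) $u_\alpha \ge \alpha/\ell_n$.

On the other hand, for all $\ell \in \{1,\dots,\ell_n\}$ and $J \in \mathcal{J}^{\ell}$, the random variables

$$U_J = -\frac{\sqrt{n-d_n}}{\|\varepsilon-\Pi_{V_n}\varepsilon\|}\,\frac{\sum_{i\in J}\varepsilon_i}{\sqrt{|J|}}$$

being distributed as Student variables with $n-d_n$ degrees of freedom, we
have that

(49) $\mathbb{P}\bigl(T_1^{\ell}(\varepsilon) > \bar T_{n-d_n}^{-1}(u_\alpha/|\mathcal{J}^{\ell}|)\bigr) \le \sum_{J\in\mathcal{J}^{\ell}}\mathbb{P}\bigl(U_J > \bar T_{n-d_n}^{-1}(u_\alpha/|\mathcal{J}^{\ell}|)\bigr) \le u_\alpha$

and thus $q_1(\ell,u_\alpha) \le \bar T_{n-d_n}^{-1}(u_\alpha/|\mathcal{J}^{\ell}|)$. This inequality together with (48)
and (46) leads to (47), as $|\mathcal{J}^{\ell}| \le \ell_n \le n/2$ and $n-d_n = n-\ell_n \ge n/2$. □

Step 3. For all $f = (F(x_1),\dots,F(x_n))'$ with $F \in \mathcal{H}_s(L)$,

(50) $\dfrac{1}{n}\|f-\Pi_{V_{n,\mathrm{cste}}}f\|^2 \le C(s)L^2n^{-2s}$.

Proof. Note that the vector

$$\tilde f = \sum_{k=1}^{\ell_n}F(\bar x_{J_k})1_{J_k}$$

belongs to $V_{n,\mathrm{cste}}$ and therefore

$$\|f-\Pi_{V_{n,\mathrm{cste}}}f\|^2 \le \|f-\tilde f\|^2 = \sum_{k=1}^{\ell_n}\sum_{i\in J_k}\bigl(F(x_i)-F(\bar x_{J_k})\bigr)^2 \le \sum_{k=1}^{\ell_n}\sum_{i\in J_k}L^2\ell_n^{-2s} = nL^2\ell_n^{-2s}.$$

Noting that $\ell_n = d_n \ge n/4$, we get (50). □

Step 4. Assuming that $n \ge (L/\sigma)^{1/s}$, there exists some constant $C$ depending
on $s$ and $\beta$ only such that

(51) $\dfrac{\bar\chi^{-1}_{n-d_n,\|f-\Pi_{V_{n,\mathrm{cste}}}f\|^2/\sigma^2}(\beta/2)}{n-d_n} \le C$.

Proof. Using the inequality due to Birge (2001) on the quantiles of
noncentral $\chi^2$, we have that

$$\bar\chi^{-1}_{n-d_n,a^2}(\beta/2) \le n-d_n + a^2 + 2\sqrt{(n-d_n+2a^2)\log(2/\beta)} + 2\log(2/\beta).$$

Setting $a = \|f-\Pi_{V_{n,\mathrm{cste}}}f\|/\sigma$ and using (50), we derive that

(52) $\bar\chi^{-1}_{n-d_n,a^2}(\beta/2)/(n-d_n) \le C(\beta,s)$. □

Step 5. Under the assumption of Step 4, for all $t \in \mathcal{T}_n$,

$$v_t(f,\beta) \le \kappa^{\star}\sqrt{\log(n)}\,\sigma,$$

for some constant $\kappa^{\star}$ depending on $\alpha$, $\beta$ and $s$ only.

Proof. We recall that

$$v_t(f,\beta) = \left\{q_t(\alpha)\sqrt{\frac{1}{n-d_n}\,\bar\chi^{-1}_{n-d_n,\|f-\Pi_{V_n}f\|^2/\sigma^2}(\beta/2)} + \bar\Phi^{-1}(\beta/2)\right\}\sigma.$$

We conclude by using the elementary inequality

$$\bar\Phi^{-1}(\beta/2) \le \sqrt{2\log(2/\beta)},$$

and by gathering (47) and (51). □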
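
To make the bounds used in Steps 4 and 5 concrete, the sketch below evaluates the upper bound on the noncentral $\chi^2$ quantile quoted in Step 4 and the Gaussian quantile bound of Step 5, and combines them into an upper bound on $v_t(f,\beta)$. The function names are hypothetical and the numerical inputs are arbitrary; `dof` stands for $n-d_n$ and `noncentrality` for $a^2 = \|f-\Pi_{V_n}f\|^2/\sigma^2$.

```python
import math
from statistics import NormalDist


def birge_upper_bound(dof, noncentrality, u):
    """Upper bound on the (1 - u)-quantile of a noncentral chi-square (Step 4)."""
    log_term = math.log(1.0 / u)
    return (dof + noncentrality
            + 2.0 * math.sqrt((dof + 2.0 * noncentrality) * log_term)
            + 2.0 * log_term)


def gaussian_upper_quantile(u):
    """Exact (1 - u)-quantile of the standard Gaussian distribution."""
    return NormalDist().inv_cdf(1.0 - u)


def v_t_upper_bound(q_t, dof, noncentrality, beta, sigma):
    """Upper bound on v_t(f, beta) obtained by plugging both bounds into its definition."""
    chi_part = math.sqrt(birge_upper_bound(dof, noncentrality, beta / 2.0) / dof)
    return (q_t * chi_part + math.sqrt(2.0 * math.log(2.0 / beta))) * sigma


example_bound = v_t_upper_bound(q_t=3.0, dof=50, noncentrality=4.0, beta=0.1, sigma=0.1)
```

Comparing `gaussian_upper_quantile(beta / 2)` with $\sqrt{2\log(2/\beta)}$ for a few values of $\beta$ shows how conservative the elementary inequality of Step 5 is.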

We conclude the proof with this final step. 
Step 6. There exists a constant $\kappa$ depending on $\alpha$, $\beta$ and $s$ only, such
that if $n$ is large enough and $F$ satisfies

(53) $\min_{x\in[0,1]}F(x) \le -\kappa\rho_n$,

then there exists $t^{\star} \in \mathcal{T}_n$ such that

(54) $\langle f,t^{\star}\rangle \ge v_{t^{\star}}(f,\beta)$.

Proof. Since $F \in \mathcal{H}_s(L)$, under Assumption (53) there exists $j \in \{1,2,\dots,n\}$
such that

$$F(j/n) \le -\kappa\rho_n + Ln^{-s}.$$

For $n$ large enough, $Ln^{-s} \le \kappa\rho_n/2$, hence $F(j/n) \le -\kappa\rho_n/2$.


Let us take k satisfying 



where k* is defined at Step 5. 
Let us define 



(2k 



*\2s/(l+2s) 



(55) 



£{n) 



4L V/ s 

up, 



and J as the element of J^ n ^> containing j. Note that for n large enough, 
£(n)e{l,...,£ n }. 

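Now, for all $k \in J$, since $F \in \mathcal{H}_s(L)$,

$$f_k = F(x_k) = F(x_j) + F(x_k) - F(x_j) \le -\kappa\rho_n/2 + L|x_k-x_j|^s \le -\kappa\rho_n/2 + L\ell(n)^{-s} \le -\kappa\rho_n/4$$

and thus, by taking $t^{\star} \in \mathcal{T}_{n,1}$ as

$$t^{\star} = -\frac{1}{\sqrt{|J|}}1_J,$$

we derive that

$$\langle f,t^{\star}\rangle = -\sqrt{|J|}\,\bar f_J \ge \sqrt{|J|}\,\kappa\rho_n/4.$$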

By construction of the partition of the data, we have for all positive 



integers p<q<r that 
(56) 



+ 1. 
For all j G {1, . . .,£(n)}, J = jf n) [see (8)] is a union of |/£» (n) | > 
disjoint sets of cardinality at least [n/£ n ]. Hence 



i A n h > 



n 

> 



U(n) 



since [x] >x/2 for all x > 1. Therefore we get 

/ \ lis 

n n ( p n \ I 



(57) |j|> " >I! [ K ¥L 

using (55). 

This implies that 

(f,t)>— I — I «Pn > « ayj log{n) 
by definition of k. □ 

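8.2. Proof for $(T_\alpha,\mathcal{C}) = (T_{\alpha,2},\mathcal{C}_{\nearrow})$. We follow the proof of Theorem 1 for
$(T_\alpha,\mathcal{C}) = (T_{\alpha,1},\mathcal{C}_{\ge 0})$: the results of Steps 1-5 still hold. The proof of Step 2
differs in the following way: (49) becomes

$$\mathbb{P}\bigl(T_2^{\ell}(\varepsilon) > \bar T_{n-d_n}^{-1}(u_\alpha/|\mathcal{T}_{n,2}^{\ell}|)\bigr) \le u_\alpha.$$

We conclude the proof of Step 2 by noticing that for all $\ell \in \{1,\dots,\ell_n\}$, $|\mathcal{T}_{n,2}^{\ell}|$
is bounded from above by $n^2/4$.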

Step 6. For $n$ large enough, under the assumption that

(58) $\inf_{G\in\mathcal{K}_{\nearrow}}\|F-G\|_\infty \ge \kappa\rho_n$,

there exists $t^{\star} \in \mathcal{T}_{n,2}$ such that $\langle t^{\star},f\rangle \ge v_{t^{\star}}(f,\beta)$.

Proof. Let us first remark that

$$\inf_{G\in\mathcal{K}_{\nearrow}}\|F-G\|_\infty \le \sup_{0\le x\le y\le 1}\bigl(F(x)-F(y)\bigr).$$

Indeed, let $G^{\star} \in \mathcal{K}_{\nearrow}$ be defined as

$$G^{\star}(y) = \sup_{0\le x\le y}F(x).$$

Then

$$\inf_{G\in\mathcal{K}_{\nearrow}}\|F-G\|_\infty \le \|F-G^{\star}\|_\infty = \sup_{0\le x\le y\le 1}\bigl(F(x)-F(y)\bigr).$$

Hence, under (58), there exists $x < y$ such that $F(x)-F(y) \ge \kappa\rho_n$. Since
$F \in \mathcal{H}_s(L)$, if $|x_i-x| \le 1/n$ and $|x_j-y| \le 1/n$, then

$$F(x_i)-F(x_j) \ge \kappa\rho_n - 2Ln^{-s} \ge \kappa\rho_n/2$$

for $n$ large enough. Hence, there exists $1 \le i < j \le n$ such that $F(x_i)-F(x_j) \ge \kappa\rho_n/2$.
Let us set 



£(n) 



8L \ 1 I S 

KPn 



which belongs to {1, . . . ,£ n } at least for n large enough. Let I and J be the 
elements of J 1 -^ satisfying i € I and j G J. 

Arguing as in Step 6 of Section 8.1, since F G TC S (L), 

Si > F(xi) - L£(n)- S and fj < F{ Xj ) + Ll(ny s 

and we deduce that 

Si-Sj> - 2L£(ny s > K Pn /4. 

This implies that there exists 1 < i* < j* < £(n) with / = J^i™" 1 and J = , 
such that 

(eP,f) = N^(f I -fj)>N^}^. 
Using (56), and since £ n = [n/2], we have that for all K G J^ n \ 



£{n) 



< \K\ <3 



£{n) 



+ 1 



which implies that 



\i\\A >c /4 



We now conclude as in the proof of Step 6 by taking t* = e^*. □ 
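8.3. Proof of Theorem 1 for $(T_\alpha,\mathcal{C}) = (T_{\alpha,3},\mathcal{C}_{\smile})$. We follow the proof of
Theorem 1 for the case $(T_\alpha,\mathcal{C}) = (T_{\alpha,1},\mathcal{C}_{\ge 0})$: the results of Steps 1-5 still
hold. Nevertheless, the proof of Step 2 differs in the following way: (49)
becomes

$$\mathbb{P}\bigl(T_3^{\ell}(\varepsilon) > \bar T_{n-d_n}^{-1}(u_\alpha/|\mathcal{T}_{n,3}^{\ell}|)\bigr) \le \sum_{t\in\mathcal{T}_{n,3}^{\ell}}\mathbb{P}\left(\frac{\langle\varepsilon,t\rangle}{\|\varepsilon-\Pi_{V_n}\varepsilon\|/\sqrt{n-d_n}} > \bar T_{n-d_n}^{-1}(u_\alpha/|\mathcal{T}_{n,3}^{\ell}|)\right) \le u_\alpha.$$

We conclude the proof of Step 2 by noticing that for all $\ell \in \{1,\dots,\ell_n\}$, $|\mathcal{T}_{n,3}^{\ell}|$
is bounded from above by $n^3/8$.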

Step 6. For $n$ large enough, under the assumption that

(59) $\inf_{G\in\mathcal{K}_{\smile}}\|F-G\|_\infty \ge \kappa\rho_n$,

there exists $t^{\star} \in \mathcal{T}_{n,3}$ such that

$$\langle t^{\star},f\rangle \ge v_{t^{\star}}(f,\beta).$$

Proof. We decompose the proof into three parts.

Part 1. For $n$ large enough, and all $F \in \mathcal{H}_s(L)$ satisfying (59), we have

$$\inf_{g\in\mathcal{C}_{\smile}}\|f-g\|_\infty \ge \kappa\rho_n/4,$$

with $f = (F(x_1),\dots,F(x_n))'$.

Proof. We first prove the following inequality:

(60) $\inf_{G\in\mathcal{K}_{\smile}}\|F-G\|_\infty \le 2Ln^{-s} + 3\inf_{g\in\mathcal{C}_{\smile}}\|f-g\|_\infty$.

Part 1 derives obviously from this inequality.

For all $g \in \mathcal{C}_{\smile}$, we consider the function $G_g \in \mathcal{K}_{\smile}$ defined as the piecewise
linear function such that for all $i$, $G_g(x_i) = g_i$ and such that $G_g$ is affine on
the interval $[0,x_2]$. Then $\inf_{G\in\mathcal{K}_{\smile}}\|F-G\|_\infty \le \|F-G_g\|_\infty$. Moreover, by
setting $x_0 = 0$ and $g_0 = G_g(0)$,

$$\begin{aligned}
\|F-G_g\|_\infty &= \sup_{i\in\{1,\dots,n\}}\ \sup_{x\in[x_{i-1},x_i]}|F(x)-G_g(x)| \\
&\le \sup_{i\in\{1,\dots,n\}}\ \sup_{x\in[x_{i-1},x_i]}|F(x)-F(x_i)+F(x_i)-G_g(x_i)+G_g(x_i)-G_g(x)| \\
&\le Ln^{-s} + \|f-g\|_\infty + \sup_{i\in\{1,\dots,n\}}|g_i-g_{i-1}|,
\end{aligned}$$

since $\sup_{x\in[x_{i-1},x_i]}|G_g(x_i)-G_g(x)| = |G_g(x_i)-G_g(x_{i-1})|$ ($G_g$ is linear on
$[x_{i-1},x_i]$). In addition, noticing that $|g_1-g_0| = |g_2-g_1|$,

$$\sup_{i\in\{1,\dots,n\}}|g_i-g_{i-1}| \le \sup_{i\in\{2,\dots,n\}}|g_i-f_i+f_i-f_{i-1}+f_{i-1}-g_{i-1}| \le 2\|f-g\|_\infty + Ln^{-s}.$$

This concludes the proof of (60). □

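Part 2. For all $f \in \mathbb{R}^n$,

(61) $\inf_{g\in\mathcal{C}_{\smile}}\|f-g\|_\infty \le \max_{1\le i<j<k\le n}\left(f_j - \dfrac{x_k-x_j}{x_k-x_i}f_i - \dfrac{x_j-x_i}{x_k-x_i}f_k\right)_+$,

where for $x \in \mathbb{R}$, $(x)_+ = x\mathbf{1}_{x>0}$ denotes the positive part of $x$.

Proof. Let us define $g^{\star} \in \mathcal{C}_{\smile}$ as follows: $g_1^{\star} = f_1$ and for $i = 1,\dots,n-1$,

$$g_{i+1}^{\star} = g_i^{\star} + \inf\left\{\frac{f_k-g_i^{\star}}{x_k-x_i},\ k > i\right\}(x_{i+1}-x_i).$$

In words, if $F_{\mathrm{lin}}$ denotes the piecewise linear function on $[x_1,x_n]$ taking the
value $f_i$ at $x_i$, then $g^{\star}$ is the vector $(G_{\mathrm{lin}}^{\star}(x_1),\dots,G_{\mathrm{lin}}^{\star}(x_n))'$, where $G_{\mathrm{lin}}^{\star}$ is
the largest convex function satisfying for all $u \in [x_1,x_n]$, $G_{\mathrm{lin}}^{\star}(u) \le F_{\mathrm{lin}}(u)$.
Note that the function $G_{\mathrm{lin}}^{\star}$ is also piecewise linear and satisfies that for all
$j \in \{1,\dots,n\}$ such that $F_{\mathrm{lin}}(x_j)-G_{\mathrm{lin}}^{\star}(x_j) > 0$, there exist $1 \le i < j < k \le n$
such that

$$F_{\mathrm{lin}}(x_j)-G_{\mathrm{lin}}^{\star}(x_j) = f_j - \frac{x_k-x_j}{x_k-x_i}f_i - \frac{x_j-x_i}{x_k-x_i}f_k.$$

Consequently,

$$\|f-g^{\star}\|_\infty = \max_{j=1,\dots,n}\bigl(F_{\mathrm{lin}}(x_j)-G_{\mathrm{lin}}^{\star}(x_j)\bigr) \le \max_{1\le i<j<k\le n}\left(f_j - \frac{x_k-x_j}{x_k-x_i}f_i - \frac{x_j-x_i}{x_k-x_i}f_k\right)_+.$$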
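
The recursion defining $g^{\star}$ and the right-hand side of (61) can be evaluated directly on a small example. The sketch below is an illustration with hypothetical function names; the cubic-time loop over triples is only meant for small $n$.

```python
def convex_minorant(x, f):
    """Recursion of Part 2: g[0] = f[0], then follow the smallest slope to a later point."""
    n = len(x)
    g = [f[0]]
    for i in range(n - 1):
        slope = min((f[k] - g[i]) / (x[k] - x[i]) for k in range(i + 1, n))
        g.append(g[i] + slope * (x[i + 1] - x[i]))
    return g


def bound_61(x, f):
    """Right-hand side of (61): largest positive part over triples i < j < k."""
    n = len(x)
    best = 0.0
    for i in range(n):
        for j in range(i + 1, n):
            for k in range(j + 1, n):
                lam = (x[k] - x[j]) / (x[k] - x[i])
                best = max(best, f[j] - lam * f[i] - (1.0 - lam) * f[k])
    return best


x = [0.0, 1.0, 2.0, 3.0, 4.0]
f = [0.0, 1.0, 0.0, 1.0, 0.0]
g = convex_minorant(x, f)
sup_distance = max(abs(a - b) for a, b in zip(f, g))
```

On this zigzag example the convex minorant is identically zero and the sup-distance equals the bound in (61); for a vector that is already convex the recursion returns the vector itself and the bound is zero.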



£(n) = 1 + 



Part 3. Lei k' = re/4. VKe se£ 

6L \V f 

If there exist l<i<j<k<n such that 

<■ ^k Xj Xj Xi i 

Jj Ji Ik — K Prn 

Xk Xi Xfc Xi 

then there exist I = jfl n \ J = Jji and K = J^* with i* < j* < k* , such 
that 

(62) fj - V^-h - l^^/jf > reVn/4- 

xk — %i xk — xi 
Proof. Note that

/ « T \ l/« 

(63) /(n) > (-7—) 

\K PnJ 

and that for n large enough, ^(n) G {1, . . . ,£ n }. 

In the sequel, we shall use the following inequalities: 

yE^{I,J,K} max \xi — XiA < - . and 
L J M'es 1 1 ~ £(n) 

(64) y 

and the following notation: 

X = ^i, A = ?^^, A = / J -A/ / -(l-A)/ K . 

Xfc Xi Xj{ %I 

We bound A from below as follows: 

+ fj- fj + Xfi - A// + (1 - \)fk - (1 - \)h 

> K Pn + fj - fj + (A - X)fi - A(/ 7 - / 4 ) + (A — A)/ fc - (1 - A)(/ K - fk) 

> np n - 2max{|// - fi\,\fj - fj\, \f K ~ fk\} - |A - A||/j - f k \. 
Let us now bound from above the quantities 

\fi-fk\, max{|//-/i|,|/j-/j|,|/if-/jfc|}, |A-A|. 
Since F £ TC S (L), we have that 

(65) \fi - f k \ = \F(xi) - F(x k )\ < L\x k - Xi \ s , 
and by using (64) that 

(66) maxfl/, - ft, \fj - \f K - f k \} < L£( n y s . 
For each (l,E) e {(i,I),(j,J),(k,K)}, let 

h = x E - xi. 

We have 



A 



X X j I fofc 

l + (h k - hj)/(x k - Xj) 
l + {h k - hi)/(x k - Xi) 

l (h k - hj)/(x k - xj) - (h k - hj)/(x k - Xj) 
1 + (h k - hi)/(x k - xi) 



A 
and as from (64) $\max\{|h_k-h_j|, |h_k-h_i|\} \le \ell(n)^{-1}$, we deduce that

(h k - hj)/ (x k - Xi) - (h k - hi) I [x k - Xi) 



|A-A| = |A| 



1 + (h k - hi)/(x k - x^ 



< 



26 



\l-6[ 



(67) 
where 

(68) 5 = ir^r V 

i{n)\xk - Xi\ 

In order to bound S from above, note that since F G 7i s (L), 

K ! p n < fj ~ A/i - (1 - X)fk 

= X(F(xj) - F(xi)) + (1 - X)(F( Xj ) - F(x k )) 
< Lmax{|xj — Xi\ s ,\x k — xj\ s } 

and therefore 

| X k Xi | ^ Ulcixj \Xj Xi I , I X k Xj I } 

= {max{|xj — Xi\ s ,\x k — Xjl 8 }} 1 ^ 



> 



L 



Thus, we deduce by (63) and the fact that s e] 0, 1] that 

L l/s 



(69) 



5< 



1 

< -. 



(K> Pn y/°e(n) - 6" 



By gathering (65)-(67), we get 



A > K'p n - 2M(n)- s - 2L 
By using (68), (69) and (63) we finally get 



1-5 



1-6 



A = K Pn - 2L£{n)- s - 2L£(n) 

> K p n /A. 

Let us now conclude the proof of Step 6. Under the assumption that 

inf ||f - gHoo > K'p n 



□ 
we know from (61) that there exists $i < j < k$ such that

Jj Ji Ik — K Pn, 

•%k %i •Ek 

and from (62) that there exist / = , J = jIt^ and K = with i* < 
j* < k* such that 

Un) v _ 



/r «W \ t\tZ{ ti ) I I \^( n ) 7 t-\ \7 \ \ i*j*k* r n 

(f,e^ fc ,) = N^ifj - Vj. fc .// - (1 - K*r k *)fx) > ^4 ■ 



Noting that for all £ £ {/, J, K} 



\E\ >2 



f 



n 



> > — - 



in 



n 



and that ||e^ fc „|| 2 < Vl^l + VI J l + VI^Ij we have that 



> 



71 



l/l" 1 + | J]" 1 + IA'1" 1 - Y Vltin) ' 
As i(n) < 2(12L/(KVn)) 1/s at least for n large enough, we deduce that 



N £(n) > 



1 n 



8(12L) 1 /" 



Consequently, we get 



n 



128L 1 /* 



> «*Jlog(n) a, 



for k suitably chosen. It remains to take t* = e^-*^* € T n ^ to complete the 
proof. 

□ 



9. Proof of Proposition 3. The proof of Proposition 3 is divided into two
parts. In Section 9.1 we show that if (24) or (25) holds, then $\mathbb{P}_{F,\sigma}(T_\alpha > 0) \ge 1-\beta$.
The second part of the proposition is shown in Section 9.2.

9.1. Proof of the first part of Proposition 3. We only prove the result
under (24), the proof under (25) being almost the same. By combining (43)
and (44) we obtain that if $F$ is such that $RF$ is $r$ times differentiable, then
for all $J \subset \{1,\dots,n\}$ there exists a sequence $\{c_{\mathbf{i}},\ \mathbf{i} \in \mathcal{I}_r\cap J^{r+1}\}$ verifying
both $c_{\mathbf{i}} \in\, ]\min_{j\in J}x_j, \max_{j\in J}x_j[$ and

(70) $-\langle f,t_J^{\star}\rangle = N_J^{-1}\sum_{\mathbf{i}\in\mathcal{I}_r\cap J^{r+1}}\dfrac{\Lambda(F)(c_{\mathbf{i}})}{r!}\,\phi_{\mathbf{i}}^2(x^r)$,

where $N_J = \mathrm{Gram}(1_J,x_J,\dots,x_J^{r-1})\,\gamma_J$. Let $i^{\star} \in J$ be such that

$$\inf_{i\in J}\Lambda(F)(x_i) = \Lambda(F)(x_{i^{\star}}).$$

We have for all $c \in\, ]\min_{j\in J}x_j, \max_{j\in J}x_j[$,

$$\Lambda(F)(c) \le \Lambda(F)(x_{i^{\star}}) + \omega(h_J).$$
Besides, by taking f = (x\/R{x\), . . . , x r n /R(x n ))' in (44) we get that 

1 , 2 ,__^ ll x J-n^/ x jl' 2 



E tf(* r 

iex r nJ r 



7J 



Now, by using (70) and (24) we deduce that 

(f)t}) ^_ A(F)(^) + ,(M ^ E ^ 



-(A(F)(^)+^(M)- 



iex r nJ , -+ 1 
5 



7jr! 
>^tj(f,/3), 
and we conclude thanks to Theorem 1. 

9.2. Proof of the second part of Proposition 3. In order to prove this 
second part, we apply the first part of Proposition 3. 

Evaluation of vt* (f , (3) ■ Let us prove that for all J£\J&L X J®, 

where k* depends on a,f3,s and r only. We use Steps 1-5 in the proof of 
Proposition 1. For Steps 1, 2 and 5 the proof is similar to the proof of 
Proposition 1. 



Step 3. For all f = (F(x\), F(x n ))' with F^ G H S {L), 

(71) ll f - n Vnff <c( s ,r)L 2 n-^ s + r \ 

n 

Proof. We recall that V n is the linear space generated by 
{1j,xj,...,xS,J€ 

Note that the vector 

r 



k=l \ 1=1 
belongs to $V_n$. Hence, using that $F^{(r)} \in \mathcal{H}_s(L)$,

llf-nrf 
< llf-fll 2 

k=iieJ k yJui = x Jk Ju2=x Jk J v r =xj k 

k=l i£Jk 

<c(v)iV* s ' 

since t n > nj (4(r + 1)) using that [x] > x/2 for x > 1. □ 

Step 4. Assuming that n> {L/a) l ^ r+s \ there exists some constant C 
depending on s,r and (3 only such that 

( 72 ) X ra _ dwi ||f_n Vn f|p/ g 2(^/2) ^ 

n - d n 

The proof is similar to the proof of Step 4 in Proposition 1 by using (71). 

Evaluation of jj. Let us prove that there exists some constant C de- 
pending on r only such that, for J such that | J| > r + 1, 



7j>C 



JI2T+1 



Since for all i, Xi = i/n, by translation 

2 II r tt r II 2 

7j = l|xj-n^ jXj || 

1 IJI 



min yV'-a -aii a r _ii r X ) 2 . 

n ^ a ,...,a r _i ^ x 
1=1 



By setting for all j £ {0, . . . , r — 1} a,j = bj \ J\ r J , we have 

\J\ 



nun y^(i r — ao — a\i — ■ ■ ■ — a r -\f 
ao,...,a r -i f— f 
Since

^ m r 1 ^g(fe) ? - 6 °----- 6 - i (R) r T 

converges as | J\ —> oo toward 

min / (x r — bo — ■ ■ ■ — b r _\x r ~ 1 ) 2 dx, 
&o,...,& r _i Jo 

which is positive, we obtain that there exists some constant C > such that 
for | J | large enough, 

I T\ 2r + 1 

Moreover, since for \J\ > r + 1, 7j > 0, the above inequality holds for \J\ > 
r + 1, possibly enlarging C. 

Evaluation of uj(hj). Let J G Since G 7i s (L), and since hj 

defined in Theorem 3 satisfies < hj < l/£, 

u{hj)= sup \ F <r\x)-F<r\y)\ 

\x—y\<hj 

< LC S . 

Conclusion. Let us prove in conclusion that if 
(73) inf F^(x)<-p n>r , 

£€[0,1] 

then (24) holds for some J G lj|=i J® ■ 

Since F^ G H S (L) under (73), there exists j G {1, . . . ,n} such that 
F {r \ Xj ) < -p n , r + LrT s < -p n , r /2 



for n large enough. 
Let 



£(n) 



L 2 n v l/(l+2r+2 S )-| 



,<t 2 log(n) 

For n large enough, £(n) G {1, . . . ,£ n }- Let J be the element of J^ n ^ con- 
taining j. Note that | J| > n/(2£(n)) at least for n large enough. This implies 
that, for n large enough, 



I T\ 1+2r 
It follows that



V 



Vr r (f,P)—+u(hj) < ^==oJ\og(n) 



\t{n)) 



r+l/2 



+ L(£(n)Y 



for some constant k depending on ct,j3,s and r. This concludes the proof of 
the proposition. 



APPENDIX 



A.l. Proof of Lemma 2. 



Proof of (42). Clearly, one has C r< R C C t ,r- We prove C r> R C C r ,_R by 
using repeatedly the following claim. 



Claim 1 . Let < u\ < U2 < ■ • ■ < u r+ \ < u r+ 2 < 1 be an increasing se- 
quence of r + 2 points of [0,1]. Let v \ , . . . , v r+ 2 be real numbers verifying 
that 

1 U 3 



Di(l,u r+ 2, ■ ■ ■ ,u r r+2 ,v r+ 2) =det 



1 V 2 \ 
-1 



Vi 



Ur+2 



U r r+ \ V r+2 



> 



and 



D r+ 2 = det 



/I ui 

1 U 2 



II 



.r-1 



1 

r-1 
U 2 V 2 



Vl \ 



V 1 U r+ \ 
Then for all j G {2, . . . , r + 1} 



1^(1,1^+2, ...,i£ +2 , u r+2 ) =det 



r-1 

V+l v r- 



> 0. 



fr+1 / 



r-1 



1 U 

1 u 



i-l 
j+i 



Vl 



«r+2 



r-1 
^-1 U i-1 



r— 1 / 

U r+2 V r+2 J 



> 0. 



Proof. For real numbers ti,...,t r we denote by vand(ii, . . . ,t r ) the 
Vandermonde determinant 



/l ti 



vand^i, . . . ,t r ) = det 



t 



r-1- 



Vi i r ••• t: 



r-1 
and for $j = 1,\dots,r+2$ we denote by $\mathbf{u}_j$ the vector $(1,u_j,\dots,u_j^{r-1},v_j)'$. Let
us fix $j \in \{2,\dots,r+1\}$. By expanding the determinant

Dj(l,u r+ 2, ■ ■ ■ ,u r r +l,v r+2 ) 
by its last column, we get that if j G {2, . . . , r}, 

Dj(l,U r+2 , ...,U T r +l,V r+ 2) 

= v r+2 vand(tii , . . . , Uj-\ , Uj+i , . . . , u r+ x ) + Dj(l, u r+2 u r r ~\ , 0) , 

and if j = r + 1 , 

D r+1 (l,u r+2 , ■ ■ . ,u r r ~l,v r+ 2) 

= v r+2 vand(ui, . . . , u r ) + D r+1 (l,u r+2 , . . . , u r r ~\, 0). 

Since the Uj's are increasing, the Vandermonde determinants are positive 
and therefore Dj(l, u r+ 2, u r+2> v r+2) is increasing with respect to t> r +2- 
On the other hand, since by assumption

\[
D_1(1,u_{r+2},\ldots,u_{r+2}^{r-1},v_{r+2})
= v_{r+2}\,\mathrm{vand}(u_2,\ldots,u_{r+1}) + D_1(1,u_{r+2},\ldots,u_{r+2}^{r-1},0) > 0,
\]

we have that

\[
v_{r+2} > -\frac{D_1(1,u_{r+2},\ldots,u_{r+2}^{r-1},0)}{\mathrm{vand}(u_2,\ldots,u_{r+1})} = v^*,
\]

and deduce that

\[
D_j(1,u_{r+2},\ldots,u_{r+2}^{r-1},v_{r+2}) > D_j(1,u_{r+2},\ldots,u_{r+2}^{r-1},v^*).
\]

It remains to show that $D_j(1,u_{r+2},\ldots,u_{r+2}^{r-1},v^*) > 0$. When $v_{r+2} = v^*$,
we have that $D_1(1,u_{r+2},\ldots,u_{r+2}^{r-1},v^*) = 0$ and therefore $\mathbf{u}^* = (1,u_{r+2},\ldots,u_{r+2}^{r-1},v^*)'$
is a linear combination of $\mathbf{u}_2,\ldots,\mathbf{u}_{r+1}$. Let us denote by $\lambda_k$ the coordinate
of $\mathbf{u}^*$ on $\mathbf{u}_k$. By Cramer's formula we have that for $k \in \{3,\ldots,r\}$

\[
\lambda_k = \frac{\mathrm{vand}(u_2,\ldots,u_{k-1},u_{k+1},\ldots,u_{r+2})}{\mathrm{vand}(u_2,\ldots,u_{k-1},u_{k+1},\ldots,u_{r+1},u_k)}
= (-1)^{r-k+1}\,\frac{\mathrm{vand}(u_2,\ldots,u_{k-1},u_{k+1},\ldots,u_{r+2})}{\mathrm{vand}(u_2,\ldots,u_{r+1})},
\]

\[
\lambda_2 = (-1)^{r-1}\,\frac{\mathrm{vand}(u_3,\ldots,u_{r+2})}{\mathrm{vand}(u_2,\ldots,u_{r+1})}
\]

and

\[
\lambda_{r+1} = \frac{\mathrm{vand}(u_2,\ldots,u_r,u_{r+2})}{\mathrm{vand}(u_2,\ldots,u_{r+1})}.
\]

Hence, the positivity of the Vandermonde determinants implies that $\lambda_j$ has
the sign of $(-1)^{r-j+1}$. Since $\mathbf{u}^* = \sum_{k=2}^{r+1} \lambda_k \mathbf{u}_k$, by linearity of the determinant

\[
D_j(1,u_{r+2},\ldots,u_{r+2}^{r-1},v^*) = \lambda_j D_j(1,u_j,\ldots,u_j^{r-1},v_j)
= (-1)^{r-j+1}\lambda_j D_{r+2}
\]

and thus, as $D_{r+2} > 0$, $D_j(1,u_{r+2},\ldots,u_{r+2}^{r-1},v^*) > 0$. □

The proof of (42) is complete. □ 
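
The two facts that drive the proof of Claim 1 can be checked on a small numerical instance: a Vandermonde determinant at increasing points is positive (it equals the product of the differences $t_b - t_a$, $a < b$), and the determinant $D_j$ is an affine function of its corner entry $v_{r+2}$ whose slope is a Vandermonde determinant. The sketch below is an illustration only and is not part of the original argument. The points `us`, the values `vs` and all function names are hypothetical choices, and exact rational arithmetic is used to avoid rounding.

```python
from fractions import Fraction
from itertools import combinations

def det(m):
    """Exact determinant by Laplace expansion along the first row (small matrices only)."""
    if len(m) == 1:
        return m[0][0]
    return sum((-1) ** c * m[0][c] * det([row[:c] + row[c + 1:] for row in m[1:]])
               for c in range(len(m)))

def vand(ts):
    """Vandermonde determinant with rows (1, t, ..., t^(r-1)), r = len(ts)."""
    r = len(ts)
    return det([[t ** p for p in range(r)] for t in ts])

def vand_product(ts):
    """Closed form of the Vandermonde determinant: product of (t_b - t_a) over a < b."""
    out = Fraction(1)
    for a, b in combinations(range(len(ts)), 2):
        out *= ts[b] - ts[a]
    return out

def d_with_corner(us, vs, v_last):
    """det of the matrix with rows (1, u, ..., u^(r-1), v), where the v-entry
    of the last row is replaced by v_last (r + 1 rows, so r = len(us) - 1)."""
    r = len(us) - 1
    rows = [[u ** p for p in range(r)] + [v] for u, v in zip(us, vs)]
    rows[-1][-1] = v_last
    return det(rows)

# Hypothetical data: r = 3, four increasing points in (0, 1), arbitrary v-values.
us = [Fraction(1, 10), Fraction(3, 10), Fraction(1, 2), Fraction(9, 10)]
vs = [Fraction(2), Fraction(-1), Fraction(4), Fraction(0)]

# Slope of the affine map v_last -> determinant: Vandermonde of the first r points.
slope = vand(us[:-1])
affine_ok = all(
    d_with_corner(us, vs, Fraction(v)) == v * slope + d_with_corner(us, vs, Fraction(0))
    for v in (-2, 1, 5)
)
```

Since `slope` is positive for increasing points, the determinant is increasing in the corner entry, which is the monotonicity used to compare $v_{r+2}$ with $v^*$.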



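Proof of (43). For $x \in [x_{i_1}, x_{i_{r+1}}]$ let us set

\[
h(x) = \det
\begin{pmatrix}
1 & x & \cdots & x^{r-1} & R(x)F(x)\\
1 & x_{i_2} & \cdots & x_{i_2}^{r-1} & R(x_{i_2})F(x_{i_2})\\
\vdots & \vdots & & \vdots & \vdots\\
1 & x_{i_{r+1}} & \cdots & x_{i_{r+1}}^{r-1} & R(x_{i_{r+1}})F(x_{i_{r+1}})
\end{pmatrix}
- A \det
\begin{pmatrix}
1 & x & \cdots & x^{r-1} & x^{r}\\
1 & x_{i_2} & \cdots & x_{i_2}^{r-1} & x_{i_2}^{r}\\
\vdots & \vdots & & \vdots & \vdots\\
1 & x_{i_{r+1}} & \cdots & x_{i_{r+1}}^{r-1} & x_{i_{r+1}}^{r}
\end{pmatrix},
\]

where $A$ is such that $h(x_{i_1}) = 0$. Since $h$ is $r$-times differentiable and satisfies
$h(x_{i_1}) = h(x_{i_2}) = \cdots = h(x_{i_{r+1}}) = 0$, there exists some $c_i \in\, ]x_{i_1}, x_{i_{r+1}}[$ such
that

\[
0 = h^{(r)}(c_i) = \det
\begin{pmatrix}
0 & 0 & \cdots & 0 & \Lambda(F)(c_i)\\
1 & x_{i_2} & \cdots & x_{i_2}^{r-1} & R(x_{i_2})F(x_{i_2})\\
\vdots & \vdots & & \vdots & \vdots\\
1 & x_{i_{r+1}} & \cdots & x_{i_{r+1}}^{r-1} & R(x_{i_{r+1}})F(x_{i_{r+1}})
\end{pmatrix}
- A \det
\begin{pmatrix}
0 & 0 & \cdots & 0 & r!\\
1 & x_{i_2} & \cdots & x_{i_2}^{r-1} & x_{i_2}^{r}\\
\vdots & \vdots & & \vdots & \vdots\\
1 & x_{i_{r+1}} & \cdots & x_{i_{r+1}}^{r-1} & x_{i_{r+1}}^{r}
\end{pmatrix},
\]

leading to $A = \Lambda(F)(c_i)/r!$. We get the result by substituting the expression
of $A$ in the equality $h(x_{i_1}) = 0$. □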



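Proof of (44). We start with the following claim.

Claim 2. Let $W$ be a linear subspace of $\mathbb{R}^k$ of dimension $q \in \{1,\ldots,k-1\}$
and let $\{\mathbf{w}^1,\ldots,\mathbf{w}^q\}$ be a basis of $W$. Then for all $\mathbf{u}, \mathbf{v}$ in $\mathbb{R}^k$

\[
\mathrm{Gram}(\mathbf{w}^1,\ldots,\mathbf{w}^q)\,\langle \mathbf{u}, (I - \Pi_W)\mathbf{v}\rangle
= \sum_{1 \le i_1 < \cdots < i_{q+1} \le k}
\det
\begin{pmatrix}
w^1_{i_1} & \cdots & w^q_{i_1} & u_{i_1}\\
\vdots & & \vdots & \vdots\\
w^1_{i_{q+1}} & \cdots & w^q_{i_{q+1}} & u_{i_{q+1}}
\end{pmatrix}
\times \det
\begin{pmatrix}
w^1_{i_1} & \cdots & w^q_{i_1} & v_{i_1}\\
\vdots & & \vdots & \vdots\\
w^1_{i_{q+1}} & \cdots & w^q_{i_{q+1}} & v_{i_{q+1}}
\end{pmatrix},
\tag{74}
\]

where

\[
\mathrm{Gram}(\mathbf{w}^1,\ldots,\mathbf{w}^q) = \det(G) \quad\text{with}\quad G = (\langle \mathbf{w}^i, \mathbf{w}^j\rangle)_{1 \le i,j \le q}.
\]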



We conclude thanks to Claim 2 by taking u = i?* f, v = x j — II^Xj, 
W = ^j and fc = |J|. 



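Proof of Claim 2. For $\mathbf{z} \in \mathbb{R}^k$, let $B(\mathbf{z})$ be the $k \times (q+1)$ matrix

\[
B(\mathbf{z}) =
\begin{pmatrix}
w^1_1 & \cdots & w^q_1 & z_1\\
\vdots & & \vdots & \vdots\\
w^1_k & \cdots & w^q_k & z_k
\end{pmatrix}.
\]

We obtain the result by computing

\[
\det(B(\mathbf{u})'B(\mathbf{v})) = \det
\begin{pmatrix}
\langle \mathbf{w}^1,\mathbf{w}^1\rangle & \cdots & \langle \mathbf{w}^1,\mathbf{w}^q\rangle & \langle \mathbf{w}^1,\mathbf{v}\rangle\\
\vdots & & \vdots & \vdots\\
\langle \mathbf{w}^q,\mathbf{w}^1\rangle & \cdots & \langle \mathbf{w}^q,\mathbf{w}^q\rangle & \langle \mathbf{w}^q,\mathbf{v}\rangle\\
\langle \mathbf{u},\mathbf{w}^1\rangle & \cdots & \langle \mathbf{u},\mathbf{w}^q\rangle & \langle \mathbf{u},\mathbf{v}\rangle
\end{pmatrix}
\]

in two different ways. The first way is direct: since $\Pi_W\mathbf{v}$ is a linear combination of the $\mathbf{w}^j$'s, subtracting the corresponding combination of the first $q$ columns from the last one does not change the determinant, and we have

\[
\det(B(\mathbf{u})'B(\mathbf{v}))
= \det
\begin{pmatrix}
\langle \mathbf{w}^1,\mathbf{w}^1\rangle & \cdots & \langle \mathbf{w}^1,\mathbf{w}^q\rangle & \langle \mathbf{w}^1,(I-\Pi_W)\mathbf{v}\rangle\\
\vdots & & \vdots & \vdots\\
\langle \mathbf{w}^q,\mathbf{w}^1\rangle & \cdots & \langle \mathbf{w}^q,\mathbf{w}^q\rangle & \langle \mathbf{w}^q,(I-\Pi_W)\mathbf{v}\rangle\\
\langle \mathbf{u},\mathbf{w}^1\rangle & \cdots & \langle \mathbf{u},\mathbf{w}^q\rangle & \langle \mathbf{u},(I-\Pi_W)\mathbf{v}\rangle
\end{pmatrix}
= \det
\begin{pmatrix}
\langle \mathbf{w}^1,\mathbf{w}^1\rangle & \cdots & \langle \mathbf{w}^1,\mathbf{w}^q\rangle & 0\\
\vdots & & \vdots & \vdots\\
\langle \mathbf{w}^q,\mathbf{w}^1\rangle & \cdots & \langle \mathbf{w}^q,\mathbf{w}^q\rangle & 0\\
\langle \mathbf{u},\mathbf{w}^1\rangle & \cdots & \langle \mathbf{u},\mathbf{w}^q\rangle & \langle \mathbf{u},(I-\Pi_W)\mathbf{v}\rangle
\end{pmatrix}
= \mathrm{Gram}(\mathbf{w}^1,\ldots,\mathbf{w}^q)\,\langle \mathbf{u},(I-\Pi_W)\mathbf{v}\rangle.
\]

The other way is to use the Cauchy-Binet formula [see Horn and Johnson (1991)]:
we calculate $\det(B(\mathbf{u})'B(\mathbf{v}))$ as a function of the $(q+1) \times (q+1)$
minors of the matrices $B(\mathbf{u})$ and $B(\mathbf{v})$, which leads to the right-hand side
of (74) and concludes the proof. □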

The proof of (44) is complete. □ 
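
As a sanity check of identity (74) in Claim 2, the following sketch evaluates both sides exactly on one small instance ($k = 4$, $q = 2$). It is an illustration under stated assumptions, not part of the proof: the vectors `ws`, `u`, `v` and the helper names are hypothetical, the projection $\Pi_W \mathbf{v}$ is obtained by solving the Gram system with Cramer's rule, and the right-hand side is taken as the sum over increasing index sets $i_1 < \cdots < i_{q+1}$.

```python
from fractions import Fraction
from itertools import combinations

def det(m):
    """Exact determinant by Laplace expansion along the first row (small matrices only)."""
    if len(m) == 1:
        return m[0][0]
    return sum((-1) ** c * m[0][c] * det([row[:c] + row[c + 1:] for row in m[1:]])
               for c in range(len(m)))

def dot(a, b):
    return sum(x * y for x, y in zip(a, b))

def lhs(ws, u, v):
    """Gram(w^1..w^q) * <u, (I - Pi_W) v>; Pi_W v is computed via Cramer's rule."""
    q = len(ws)
    G = [[dot(wi, wj) for wj in ws] for wi in ws]
    gram = det(G)
    b = [dot(wi, v) for wi in ws]
    # Coefficients c with Pi_W v = sum_i c_i w^i solve G c = b.
    coeffs = [
        Fraction(det([row[:i] + [b[r]] + row[i + 1:] for r, row in enumerate(G)]), gram)
        for i in range(q)
    ]
    proj = [sum(c * w[t] for c, w in zip(coeffs, ws)) for t in range(len(v))]
    resid = [v[t] - proj[t] for t in range(len(v))]
    return gram * dot(u, resid)

def rhs(ws, u, v):
    """Sum over index sets I of size q+1 of det(B(u)_I) * det(B(v)_I)."""
    k, q = len(u), len(ws)
    total = 0
    for I in combinations(range(k), q + 1):
        Bu = [[w[i] for w in ws] + [u[i]] for i in I]
        Bv = [[w[i] for w in ws] + [v[i]] for i in I]
        total += det(Bu) * det(Bv)
    return total

# Hypothetical instance: two independent vectors spanning W in R^4.
ws = [[1, 0, 1, 2], [0, 1, 1, -1]]
u = [1, 2, 3, 4]
v = [2, -1, 0, 5]
```

On this instance `lhs(ws, u, v)` and `rhs(ws, u, v)` agree, as (74) predicts. Laplace expansion is used only because the matrices are tiny; for larger $k$ an elimination-based determinant would be the more practical choice.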

A.2. Proof of Proposition 2. 

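Case $\mathcal{K} = \mathcal{K}_{\ge 0}$. Let $\mathbb{P}_F$ be the law of $Y$ under the model defined by (16).
Let $\Phi$ be a test of level $\alpha$ of the hypothesis "$F \in \mathcal{K}_{\ge 0}$". Let us define the test $\Psi$
of the hypothesis "$F = 0$" against "$F \ne 0$" which rejects the null if $\Phi(Y) = 1$
or if $\Phi(-Y) = 1$. Since $0 \in \mathcal{K}_{\ge 0}$ and since

\[
\mathbb{P}_0(\Phi(Y) = 1) = \mathbb{P}_0(\Phi(-Y) = 1) \le \alpha,
\]

the test $\Psi$ is of level $2\alpha \le 3\alpha$. Let $\rho_n(\Phi,\mathcal{F})$ be the $\Delta$-uniform separation
rate of $\Phi$ over $\mathcal{F}$. It is enough to show that

To do so, we show that the $\|\cdot\|_\infty$-uniform separation rate of $\Psi$ over $\mathcal{F}$ is
not larger than $\rho_n(\Phi,\mathcal{F})$, which means that for all $F \in \mathcal{F}$ such that $\|F\|_\infty \ge \rho_n(\Phi,\mathcal{F})$
we have $\mathbb{P}_F(\Psi(Y) = 1) \ge 1 - \beta$.
Let $F \in \mathcal{F}$. If $\|F\|_\infty \ge \rho_n(\Phi,\mathcal{F})$, then

\[
\text{either}\quad \Delta(F) = \sup_{x \in [0,1]} \bigl(-F(x)\mathbb{1}_{F(x)<0}\bigr) \ge \rho_n(\Phi,\mathcal{F})
\quad\text{or}\quad \Delta(-F) \ge \rho_n(\Phi,\mathcal{F}).
\]

In the first case, by definition of $\rho_n(\Phi,\mathcal{F})$ we have $\mathbb{P}_F(\Phi(Y) = 1) \ge 1 - \beta$ and
consequently $\mathbb{P}_F(\Psi(Y) = 1) \ge 1 - \beta$. Note that in the other case the same is
true since by symmetry of the law of $Y - F$

\[
\mathbb{P}_F(\Phi(-Y) = 1) = \mathbb{P}_{-F}(\Phi(Y) = 1).
\]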

Case $\mathcal{K} = \mathcal{K}_{\nearrow}$. We argue similarly. Let $\Phi$ be a test of level $\alpha$ of the
hypothesis "$F \in \mathcal{K}_{\nearrow}$". We also consider the test $\Phi'$ of level $\alpha$ of "$F = 0$"
against "$F \ne 0$" which rejects the null when $|\sqrt{n}\int_0^1 dY(t)|$ is large enough
(namely, larger than the $1 - \alpha$ quantile of a standard Gaussian random
variable). Finally, we define the test $\Psi$ of the hypothesis "$F = 0$" against
"$F \ne 0$" which rejects the null if $\Phi(Y) = 1$ or $\Phi(-Y) = 1$ or $\Phi'(Y) = 1$. Since
$0 \in \mathcal{K}_{\nearrow}$, we have that the so-defined test $\Psi$ is of level $3\alpha$.

Some easy computations show that there exists some constant $\kappa$ depending
on $\alpha$ and $\beta$ only such that $\Phi'$ rejects the null with probability not smaller
than $1 - \beta$ as soon as $|\int_0^1 F(t)\,dt|$ is larger than $\kappa\sigma/\sqrt{n}$ (the sum of the $\beta$
and $1 - \alpha$ quantiles of a standard Gaussian suits for $\kappa$). On the other hand,
note that

\[
\Delta(F) = \tfrac{1}{2} \sup_{0 \le s \le t \le 1} (F(s) - F(t))
\]

and thus, by definition of the $\Delta$-separation rate $\rho_n(\Phi,\mathcal{F})$ of $\Phi$ over $\mathcal{F}$, $\Psi$
rejects the null with probability not smaller than $1 - \beta$ under all alternatives
$F \in \mathcal{F}$ satisfying

\[
\max\{\Delta(F),\Delta(-F)\} = \tfrac{1}{2} \sup_{0 \le t,s \le 1} |F(t) - F(s)| \ge \rho_n(\Phi,\mathcal{F}).
\]

Therefore, since

\[
\|F\|_\infty \le \sup_{t \in [0,1]} \Bigl| F(t) - \int_0^1 F(s)\,ds \Bigr| + \Bigl| \int_0^1 F(s)\,ds \Bigr|
\le \int_0^1 \sup_{t \in [0,1]} |F(t) - F(s)|\,ds + \Bigl| \int_0^1 F(s)\,ds \Bigr|
\le \sup_{t,s \in [0,1]} |F(t) - F(s)| + \Bigl| \int_0^1 F(s)\,ds \Bigr|,
\]

$\Psi$ rejects the null with probability not smaller than $1 - \beta$ under all alternatives $F \in \mathcal{F}$
such that

\[
\|F\|_\infty \ge 2\rho_n(\Phi,\mathcal{F}) + \kappa\sigma/\sqrt{n},
\]

and the result follows.
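
The chain of inequalities used in the nondecreasing case can be illustrated on a discretized function. The sketch below is only an illustration under an explicit assumption: the function is replaced by its values on a finite grid and the integral $\int_0^1 F(s)\,ds$ by the grid average, which is not how the paper's continuous model is defined. The sample values `F` and the function names are hypothetical.

```python
from fractions import Fraction

def sup_norm(f):
    """Discrete analogue of the sup norm."""
    return max(abs(x) for x in f)

def grid_mean(f):
    """Grid average, standing in for the integral of F over [0, 1] (assumption)."""
    return sum(f) / len(f)

def oscillation(f):
    """Discrete analogue of sup over (t, s) of |F(t) - F(s)|."""
    return max(f) - min(f)

def dist_to_nondecreasing(f):
    """(1/2) * max over s <= t of (f[s] - f[t]), the discrete analogue of Delta(F)."""
    best, running_max = Fraction(0), f[0]
    for x in f:
        running_max = max(running_max, x)
        best = max(best, running_max - x)
    return best / 2

# Hypothetical grid values of a non-monotone function.
F = [Fraction(x) for x in (1, 3, 2, -1, 4)]
m = grid_mean(F)

# First and last bounds of the chain: centering, then oscillation.
middle = max(abs(x - m) for x in F) + abs(m)
upper = oscillation(F) + abs(m)

# max{Delta(F), Delta(-F)} should equal half the oscillation.
delta_max = max(dist_to_nondecreasing(F), dist_to_nondecreasing([-x for x in F]))
```

Here `sup_norm(F) <= middle <= upper` holds and `delta_max` equals half of `oscillation(F)`, so a large sup norm forces either a large oscillation (detected by $\Phi(Y)$ or $\Phi(-Y)$) or a large mean (detected by $\Phi'$).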